February 27, 2004

XAMLON: XAML before Longhorn

I am amazed by all the early activity around Longhorn in the blogs I read, while Whidbey's release is still at least a year away...
Anyway, waiting until Longhorn to benefit from declarative UI development seems like a long time. XAMLON is a library that attempts to bridge the gap by bringing a subset of XAML to the .NET 1.1 world.
Its approach is a bit different from Avalon's, as the XAML isn't compiled, but instead interpreted at runtime, and as a consequence the markup can't contain logic code. My guess is that's because the 1.1 framework doesn't support partial classes.

There are some other attempts at describing .NET UI with XML, like UISerializer and XML Forms.

In the non-.NET world, there is Glade XML (for GTK), XUL (used in Mozilla and some Java projects like Thinlets or Luxor XUL) and Swing XML (SwiXml).

Update: Andrew linked to another XAML rendering engine.
Miguel on XAML and other PDC topics.
XUL# and XBL#.

Update:
Beyond HTML: A Look at Markup Languages For Creating Rich "Classic Desktop-Style" UIs.

Posted by Julien at 01:16 PM | Comments (1)

February 26, 2004

Classical Computer Science Texts

A list of classical computer science texts. Lots of good reading.
Amongst others, it links to "The Anatomy of a Large-Scale Hypertextual Web Search Engine" from the Google founders, "Go To Statement Considered Harmful" from Dijkstra, and Knuth's "Computer Programming as an Art".

The link for Bayesian Networks without Tears was broken, but it wasn't much of a challenge for Google.

Posted by Julien at 12:22 AM | Comments (0)

February 22, 2004

What do computers dream about?

Electric Sheep is a poetic screen saver that uses distributed computation to visualize the "collective dream of sleeping computers" (i.e., cool looking fractal animations).
You can get some sample movies from the website, as well as a documentary with artistic and technical explanation.
Scott Draves's article gives an interesting overview of how the sheep are born, bred and evolved.

Another interesting kind of mutation: animating walking characters via evolution.

Update: A DVD was made from Electric Sheep animations. You can get it online from Spotworks.

Posted by Julien at 09:35 AM | Comments (0)

February 18, 2004

Ubiquitous data product: DataPod

DataPod seems very promising. It is a peer to peer synchronizing folder, oriented towards businesses.
It is still in beta stage and was presented at DEMO 2004.

Here is another short description. It mentions multiple subscription options (monthly, yearly and lifetime).
On the other hand, this other article reports a fixed $70-100 price, and the discussion thread lists some other folder synchronization products: iFolder and FusionOne.

Posted by Julien at 12:34 PM | Comments (2)

February 13, 2004

M&Ms and sphere packing

Experiments and simulations have found M&Ms to have a very good packing density. They achieve a 71% density in random arrangements, compared to 64% for spheres (and 74% for regularly stacked spheres).
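
For reference, the 74% figure for regularly stacked spheres is the density of the densest regular arrangements (face-centered cubic or hexagonal close packing), which works out to pi / sqrt(18). A quick check, as a minimal sketch:

```python
import math

# Density of the densest regular sphere packing (face-centered cubic
# or hexagonal close packing): pi / sqrt(18).
regular_density = math.pi / math.sqrt(18)

print(f"{regular_density:.4f}")
```

This gives about 0.7405. The 71% and 64% figures for random arrangements come from the experiments and simulations and have no such closed form here.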

You've got to love M&Ms ;-)

Links:
Sphere packing problem (Wikipedia).
Sphere packing problem (Mathworld).

Update:
Roland Piquepaille points out "A team of computer scientists at the Fraunhofer Institute for Algorithms and Scientific Computing (SCAI) has just developed Contpack, a 3D software to solve this problem and to maximize the volume utilization of containers".

Posted by Julien at 04:26 PM | Comments (0)

February 12, 2004

Essential utility: "screen"

Screen is a command line utility that comes with lots of Unix variants (including Linux). It is a bit hard to describe and to discover. It provides two main features: multiplexing a terminal and detaching/re-attaching sessions.

I was originally shown this by my "System programming" TA, when I asked him if I could disconnect X11 applications from one terminal and have them re-open (in the same state) on another terminal. It turns out this is possible for X11 applications (see xmove or xmx, proxy-based redirections, or guiEvict <via sweetcode>, described in this paper). But "screen", although limited to text applications, is so useful that I dropped my search for its X11 equivalent.

Multiplexing: If you start a remote shell (say SSH) but want to do more than one thing at a time, you may want to open more sessions, but you can also use screen. It allows you to switch between multiple shells on the same connection.
You first need to run "screen", then use "C-A C-C" (control A, control C) to create a new shell. You can then switch between shells with "C-A *", where * is the number of your shell (for example, use "C-A 1" to switch to shell 1).

Session migration: Let's say you need to disconnect from that session, but you want to continue it from another computer. You can use "C-A C-D" to detach screen. All the shells will be maintained, and you can reconnect to the remote machine and type "screen -R" to re-attach to the original session. This is also very useful in case the connection is lost by accident.
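
The detach/re-attach workflow can also be scripted. The sketch below only builds the command lines and does not run them. The session name "work" and the sample command are made up for illustration, and it assumes the usual -dmS (start detached with a name), -r (re-attach) and -ls (list sessions) options of GNU screen.

```python
import shlex

def screen_argv(action, session=None, command=None):
    """Build the argument list for a few common screen invocations."""
    if action == "start":
        # -d -m: start the session detached; -S: give it a name.
        argv = ["screen", "-dmS", session]
        if command:
            argv += shlex.split(command)
        return argv
    if action == "attach":
        # -r: re-attach to a detached session.
        return ["screen", "-r", session]
    if action == "list":
        return ["screen", "-ls"]
    raise ValueError(f"unknown action: {action}")

print(screen_argv("start", "work", "tail -f /var/log/syslog"))
print(screen_argv("attach", "work"))
```

The returned list can be handed to subprocess.run on a machine where screen is installed.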

Update: a similar story (with more details) was just posted on kuro5hin.

Posted by Julien at 03:05 PM | Comments (0)

February 10, 2004

Information wants to be decentralized

One of the top limitations with computers today (along with search) is the problem of keeping information synchronized between multiple machines, described earlier in "Private Information Network".
I keep reading about other people reporting the same need. Just in the last couple of weeks, Richard mentioned synchronizing bookmarks, Jon Udell wrote about keeping devices synchronized so that they are interchangeable, Wired listed "Make networked home PCs back each other up" in its "101 Ways to Save the Internet" list, and Tim discussed various solutions to keep feeds and mail synchronized.

In the past, I have relied on a server based approach to handle a subset of my information: bookmarks (using a custom bookmark manager + a bookmarklet), RSS feeds (using bloglines) and email (using Outlook Web Access). But this feels quite limited and brittle, because it is web-based and centralized.

Local vs. web-based review:
Local
+ fast/rich UI, control
- access, backup

Central/hosted
+ access, setup, aggregation of data across multiple users (PageRank, recommendations, ...)
- slower/simpler UI (likely web-based), dependency on service provider, cost, backup, resource efficiency

Central/personal
+ access, control, resource efficiency
- slower/simpler UI, setup, maintenance, backup

List of the dimensions considered:
access: having access to your information from anywhere,
UI: responsive/rich vs. slower/web-based,
control: choice and customization of the software,
backup: risk of losing the data (hard-drive crash, service provider goes bankrupt),
setup & maintenance: hassle and technical difficulty to set up and maintain,
resource efficiency & cost: exploits mainly un-used resources vs. new infrastructure is needed.


The major trade-off is getting a slower, web UI in exchange for better access. But why isn't there a way to get both most of the time?
Also, even the centralized solutions carry risks of losing the data, as it is still in only one location.

Decentralized/personal:
Why not move toward a decentralized application model, where the data is kept on multiple machines, accessible and secure?
For example, my various machines (home, work, laptop, PDA,...) would stay connected via a private P2P network, which would allow local caching and remote synchronization of the data. Some form of web access could also be useful, in case you want to access some information from a friend's machine.

This model is based on assumptions: hard drive space is cheap and lots of it is wasted, the most important/valuable information isn't very big, and the network is fast and often un-used at its full capacity.

Pluses:
+ access: your information is available either by replication to most of your machines, on the fly caching or web access,
+ fast, rich UI: the data is cached locally for rich interaction,
+ control: you can pick and switch applications easily,
+ backup: the distributed replication ensures a natural backup,
+ resource efficiency: local caching helps limit the network usage, wasted hard drive space is used.

Minuses:
- replication time: if the machine with the latest version of an item didn't replicate before going offline, you only have access to an older version,
- setup: machines must be added to the private P2P network one by one, replication configuration can be tricky,
- synchronization conflicts: because there are multiple copies of the same piece of data, any changes made offline need to be merged and sometimes manual intervention is needed to solve conflicts,
- resource efficiency: data replication isn't the most efficient use of storage, synchronization of a number of machines isn't network efficient either.
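
To make the conflict case concrete, here is a minimal sketch of one common way to detect it: a version vector, where each copy of an item carries one edit counter per machine. This is a general technique, not something prescribed above, and the machine names are made up.

```python
def compare(a, b):
    """Compare two version vectors (dicts mapping machine name -> counter).

    Returns "equal", "a_newer", "b_newer" or "conflict".
    """
    machines = set(a) | set(b)
    a_ahead = any(a.get(m, 0) > b.get(m, 0) for m in machines)
    b_ahead = any(b.get(m, 0) > a.get(m, 0) for m in machines)
    if a_ahead and b_ahead:
        return "conflict"  # both sides changed independently
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"

def record_edit(vector, machine):
    """Return a new vector with one more local edit made on `machine`."""
    updated = dict(vector)
    updated[machine] = updated.get(machine, 0) + 1
    return updated

def merge(a, b):
    """Vector to store once the two copies have been reconciled."""
    return {m: max(a.get(m, 0), b.get(m, 0)) for m in set(a) | set(b)}

base = {"home": 1}
laptop = record_edit(base, "laptop")  # edited offline on the laptop
work = record_edit(base, "work")      # edited offline at work

print(compare(laptop, base))  # the laptop copy simply supersedes the base
print(compare(laptop, work))  # concurrent edits: a merge is needed
```

Detection is the easy half. Deciding what the merged content should be still depends on the application, which is where manual intervention comes in.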

Recommendations for a framework:
P2P connectivity: some machines may be behind NATs or firewalls. Using a P2P topology helps restore connectivity between the machines of the private network.
API based synchronization: file-level synchronization makes it difficult to handle conflicts. If applications run on top of a changeset management API then it should be easier. But conflict resolution still seems like the toughest problem in such a decentralized architecture.
Synchronous/asynchronous replication: in some cases replication can occur on-demand, when I request a local copy of a file, but it can also happen in the background, either on a schedule or when the machine is idle.
Network efficient: not all data should be mirrored to all machines. Data should be transferred as directly as possible between machines.
Secure: information stored in the private network needs to be secured against un-authorized access. This may be extended in the future to support file sharing between private machine networks (say friend to friend).
Web bridging: a fall back solution should be provided to access from a machine outside of the network, via a web interface and maybe an applet.
Streaming support: why transfer a DivX to a local machine before viewing it? If the network supports it, the player should stream the media over the private network.
Metadata replication: do you really need to replicate all your MP3s? But you still might want to back them up one way or another. Replicating the metadata (file hash, filename, MP3 title,...) is enough to recover the content from the internet for some kinds of files.
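
As a rough illustration of the metadata idea, the sketch below builds a small record (name, size, hash) that could be replicated in place of the file, and checks a recovered copy against it. The field names and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
import os
import tempfile

def file_metadata(path, chunk_size=1 << 20):
    """Return a small, replicable record describing the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in chunks so large media files are never loaded whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return {
        "name": os.path.basename(path),
        "size": os.path.getsize(path),
        "sha256": digest.hexdigest(),
    }

def matches(path, record):
    """Check that a recovered file is the one the record describes."""
    return file_metadata(path)["sha256"] == record["sha256"]

# Small demonstration with a temporary file standing in for an MP3.
with tempfile.NamedTemporaryFile(suffix=".mp3", delete=False) as tmp:
    tmp.write(b"not really audio")
record = file_metadata(tmp.name)
print(record["size"], matches(tmp.name, record))
os.remove(tmp.name)
```

The record is a few hundred bytes regardless of the file size, so replicating it to every machine costs almost nothing.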

Links:
InterMezzo filesystem.
Synchronization of Information Aggregators using Markup (SIAM) and the challenges of synching.
WinFS synchronization.
StreamAgent.
The Dangers of Replication, and a Solution (by Jim Gray and others).
A list of version control systems (darcs and monotone seem really interesting, since they seem to manage decentralized merges).
Weblications: a great summary of the evolution of applications toward the web. Google has made pretty responsive and rich web UIs based on the centralized model. I still believe a decentralized model with caching will appear and disrupt these large service providers, but we're not there yet.

Posted by Julien at 10:50 AM | Comments (3)

February 04, 2004

Video streaming over HTTP

My previous roommate and I used to run into this problem a lot: we'd want to watch a movie in one room, but the file would be on another computer.
This is a general problem that I mentioned before in "Private Information Network".

But we now have a partial solution to this video/DivX streaming problem. It uses the super cool™ streaming-over-HTTP feature of the VideoLAN client and a custom HTTP proxy.

VideoLAN:
I had heard about the MPEG-2 (DVD) streaming capabilities of this software before, but only recently found out about the DivX support. To use it, you just need to put the DivX file on an HTTP/1.1 web server and use the URL in VLC (VideoLAN client).
It supports jumping to a specific part of the file, but it takes a few seconds (until a keyframe is reached, I guess) for the complete image to be refreshed to the new location in the stream.

StreamAgent:
The idea of StreamAgent is to have a local HTTP agent map all the servers in a group together and proxy the requests to the appropriate server.
For example, on machine1 (running agent1), you could use http://localhost:<StreamAgentPort>/agent2/GetFile?file=test.avi to stream test.avi from machine2 (running agent2).
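
The mapping from a local URL to the remote agent can be sketched as follows. This is not the actual StreamAgent code: the agent table, the addresses, the ports and the shape of the remote URL are all hypothetical. Only the /agent2/GetFile?file=test.avi form comes from the example above.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

# Hypothetical table of known agents: name -> (host, port).
AGENTS = {
    "agent2": ("192.168.0.12", 8080),
}

def route(local_url):
    """Translate a request received by the local agent into the URL
    to fetch from the agent that actually holds the file."""
    parts = urlsplit(local_url)
    # The path looks like /<agent name>/<operation>.
    agent, _, operation = parts.path.lstrip("/").partition("/")
    if agent not in AGENTS:
        raise KeyError(f"unknown agent: {agent}")
    host, port = AGENTS[agent]
    filename = parse_qs(parts.query)["file"][0]
    # Assumed remote form: the same operation, without the agent prefix.
    query = urlencode({"file": filename})
    return f"http://{host}:{port}/{operation}?{query}"

print(route("http://localhost:8000/agent2/GetFile?file=test.avi"))
```

The local agent would then fetch that URL and relay the response to the player.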

Our tests on a wired LAN were successful and the video streamed smoothly. But usability problems made it a pain to actually use, because the inter-agent module wasn't flexible enough (use of IP addresses, agents have to be started in a specific order,...) and we only had basic content discovery/browsing.

In terms of proxying the HTTP requests, supporting byte ranges (the Range request header and the matching Content-Range response header) is key for VLC to download chunks of the file.
Parsing the HTTP request and response was more trouble though. The standard IO packages from the .NET framework wouldn't let us parse mixed streams (text and binary data), and switching stream objects on the fly didn't work because of buffering issues.
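
Both points can be sketched in a few lines. This is an illustration in Python rather than the original .NET code: the headers are read from a single binary stream, stopping exactly at the blank line so the body bytes stay unread, and the request's Range value is turned into the offsets used for the Content-Range of the reply. Only single byte ranges are handled.

```python
import io

def read_head(stream):
    """Read the start line and headers from a binary stream.

    Reads byte by byte up to the blank line, so nothing past the header
    block is consumed and the body can be read from the same stream.
    """
    raw = bytearray()
    while not raw.endswith(b"\r\n\r\n"):
        byte = stream.read(1)
        if not byte:
            raise ValueError("stream ended before the end of the headers")
        raw += byte
    lines = raw.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in lines[1:]:
        if line:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
    return lines[0], headers

def parse_range(value, size):
    """Turn a single 'bytes=start-end' Range value into inclusive offsets."""
    unit, _, spec = value.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single byte range is handled here")
    first, _, last = spec.strip().partition("-")
    if first == "":  # suffix form: the last N bytes of the file
        start, end = max(size - int(last), 0), size - 1
    else:
        start = int(first)
        end = min(int(last), size - 1) if last else size - 1
    if start > end:
        raise ValueError("range not satisfiable")
    return start, end

def content_range(start, end, size):
    """Value of the Content-Range header for a 206 response."""
    return f"bytes {start}-{end}/{size}"

# A response whose body is binary: the body is still unread afterwards.
reply = io.BytesIO(b"HTTP/1.1 206 Partial Content\r\n"
                   b"Content-Range: bytes 0-3/10\r\n\r\n\x00\x01\x02\x03")
status_line, headers = read_head(reply)
print(status_line, headers["content-range"], reply.read())

# A player asking for everything from byte 1000 of a 5000-byte file.
print(content_range(*parse_range("bytes=1000-", 5000), 5000))
```

Reading one byte at a time is slow on an unbuffered socket. A real proxy would keep its own buffer and hand the leftover bytes to the body reader, which is the same idea with more bookkeeping.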

The inter-agent communication is supposed to use a P2P transport, to ensure "connectivity" between agents that are separated by NATs. It would also handle the naming of the agents within the group (most home machines aren't registered in the DNS). But our initial implementation only supports two machines that can directly connect to each other by IP, because we couldn't find a P2P framework for use in C#.
Microsoft's P2P SDK doesn't come with managed wrappers, and doesn't support traversing IPv4 NATs.
JXTA seems pretty suitable from what I read (restores end-to-end connectivity as much as possible, secures the peer group,...), but there isn't any .NET implementation.

Future:
I'm thinking about learning some more about JXTA and attempting a Java implementation of the StreamAgent. Let me know if you'd be interested.
The current implementation isn't online, as I don't have a running CVS server for now, but I still have the CVS archive. Also, it is fairly limited anyway, since the inter-agent communication component is the most trivial possible and most of the code is HTTP proxying...

An interesting extension would be to offer a programmatic API to a running StreamAgent (rather than through HTTP). This could be used to allow applications to replicate/sync data across the group, like bookmarks for example. It could also be used to provide remote file access with local caching (see InterMezzo).

Posted by Julien at 01:55 PM | Comments (3)