Curiosity is bliss · Julien Couvreur's programming blog and more

Simon for Particle Photon

I received a Particle Photon this weekend. It’s an amazing little device, and cheap too ($19 without the shield). It has a number of digital I/O ports and one analog, as well as built-in wi-fi (for deploying your code, receiving commands, publishing data/notifications and other TCP communications) and serial communication through USB.

My first project with the Photon, paired with an Internet Button shield (11 LEDs, 4 buttons, an accelerometer and more), was a classic toy game, Simon. It was a natural fit and required no additional hardware or assembly, just software. And it’s still surprisingly fun to play.
The code is available for anyone to use in Particle’s web-based IDE (look for “Simon” in community libraries) and also on GitHub.

Simon on Particle Photon

Beyond that, some ideas I have so far: building a self-balancing mini-Segway with servo motors, using it as a USB extension for mobile devices (although the USB host capability is not yet built into the firmware), driving an LCD display, or simply using it as an IR remote. I also noticed some cool existing Photon projects, such as a sous-vide cooker and a streaming internet microphone.

The original Simon:

OriginalSimon.jpg

Git Internals

I started to use Git more regularly and was curious about its design. Pro Git is an excellent and free book on using and understanding Git. I’ll share some minimalist notes I took on Git’s internal design. The design is simple and elegant. It’s been very enjoyable to learn about.

Object model

The Git object model has three main types: blobs (for files), trees (for folders) and commits. Objects are immutable (they are added but not changed or removed) and every object is identified by its unique SHA-1 hash.
A blob is just the contents of a file. By default, every new version of a file gets a new blob, which is a snapshot of the file (not a delta, as in many other version control systems).
A tree is a list of references to blobs and trees.
A commit is a reference to a tree, a reference to parent commit(s) and some decoration (message, author).

Then there are branches and tags, which are typically just references to commits.
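To make the relationships concrete, here is a toy in-memory sketch of that model. It is only an illustration: the class names, the ToyStore helper and the way identifiers are computed (a SHA-1 of the Python repr) are my own assumptions and not how Git actually serializes objects.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Blob:
    content: bytes  # the contents of one version of a file


@dataclass(frozen=True)
class Tree:
    # Each entry pairs a name with the id of a blob or of another tree.
    entries: Tuple[Tuple[str, str], ...]


@dataclass(frozen=True)
class Commit:
    tree: str                 # id of the top-level tree
    parents: Tuple[str, ...]  # ids of the parent commit(s)
    message: str
    author: str


ToyObject = Union[Blob, Tree, Commit]


class ToyStore:
    """Hypothetical append-only store keyed by a hash of each object."""

    def __init__(self) -> None:
        self.objects: Dict[str, ToyObject] = {}
        self.branches: Dict[str, str] = {}  # branch name -> commit id

    def put(self, obj: ToyObject) -> str:
        # Toy identifier: SHA-1 of the repr, NOT Git's real serialization.
        object_id = hashlib.sha1(repr(obj).encode("utf-8")).hexdigest()
        self.objects[object_id] = obj  # objects are added, never modified
        return object_id


store = ToyStore()

# First commit: one file in the root folder.
readme_v1 = store.put(Blob(b"hello\n"))
root_v1 = store.put(Tree((("README", readme_v1),)))
first = store.put(Commit(root_v1, (), "first commit", "alice"))

# Second commit: a new version of the file gets a new blob and a new tree.
readme_v2 = store.put(Blob(b"hello, world\n"))
root_v2 = store.put(Tree((("README", readme_v2),)))
second = store.put(Commit(root_v2, (first,), "update README", "alice"))

# A branch is just a name pointing at a commit id.
store.branches["master"] = second
```

Storing the same content twice yields the same identifier, so nothing is duplicated, while any change to a file ripples up into a new tree and a new commit.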

This illustration (borrowed from Pro Git) shows branches, commits, trees and blobs and their relationships: git-data-model-4.png

High-level Git commands (init, add, commit) are the most common way of manipulating the object model (and its underlying stored representation), but Git also offers a number of low-level commands (for instance to create a blob object).

Storage format

Let’s move on to the physical storage of those objects. All the repository data is stored in the .git folder, which has the following structure:

objects/ 
refs/ 
HEAD 
config
description
hooks/ 
info/ 

We’ll cover objects, refs and HEAD.

Objects folder

The .git/objects folder looks like this:

<SHA-1 named files>
pack file and index file

All objects are stored in this folder, using their SHA-1 identifier as filename (the first two characters of the identifier are used as the sub-folder name and the remaining characters as the filename). The objects can optionally be packed, in which case they get moved into a pack file, which comes with an index file.

As we’ve seen, each type of object holds a different kind of information:
* blob (contents of a file)
* tree (list of filenames, each with a SHA-1 reference and an object type, which can be normal, executable, symbolic link or directory)
* commit (reference to the top-level tree, parent commit(s), author information and commit message)

Each object type has a specific serialization to file. For instance blob objects are serialized as “blob <space> <content length> \0 <content>”. The SHA-1 identifier is computed over that serialized form, which is then compressed with zlib before being written to disk.
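As a minimal sketch of the blob case, the snippet below builds the header (the word blob, a space, the content length in bytes and a NUL byte), appends the raw content, hashes the result with SHA-1 and derives the path of the loose object. The function names are mine, chosen for illustration; only the standard hashlib and zlib modules are used.

```python
import hashlib
import zlib


def serialize_blob(content: bytes) -> bytes:
    """Return the uncompressed form: header, NUL byte, then raw content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return header + content


def blob_id(content: bytes) -> str:
    """The object identifier is the SHA-1 of the serialized form."""
    return hashlib.sha1(serialize_blob(content)).hexdigest()


def loose_object_path(object_id: str) -> str:
    """First two hex characters name the sub-folder, the rest the file."""
    return ".git/objects/" + object_id[:2] + "/" + object_id[2:]


def compress_blob(content: bytes) -> bytes:
    """The bytes written to disk are the zlib-compressed serialization."""
    return zlib.compress(serialize_blob(content))


empty_id = blob_id(b"")
print(empty_id)
print(loose_object_path(empty_id))
```

If you have Git at hand, you can compare the printed identifier with what "git hash-object" reports for an empty file.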

Pack file

As you commit multiple versions of a file, the objects folder grows and contains a lot of duplication. A Git command (such as git gc) lets you pack the objects. This creates an index file and a pack data file.
The index is a list of SHA-1 object identifiers that got packed, and for each, some information for finding the object in the pack data file.

The data can be stored in different ways in the data file (either a snapshot or a delta), so depending on the case the index row will contain different information:
1. Simple or snapshot index entries have an identifier, object type, object size, and start/end offsets for finding the blob in the pack data file.
2. Delta index entries have an identifier, object type, the SHA-1 identifier of the baseline object, and start/end offsets for finding the delta blob in the pack data file.

When Git packs the objects, it decides which objects to keep as snapshots and which to keep as deltas.

References folder

The .git/refs folder looks like this:

refs/heads
refs/tags
refs/remotes

All the objects we have stored so far can only be accessed if you know their identifier. Branches and tags are ways to keep a handle on a few of those identifiers, by giving them a friendlier name and making it possible to enumerate them.

heads contains files named after branches. Each holds the SHA-1 reference of a commit object.
tags contains files named after tags. Each contains the SHA-1 reference of a commit object (for simple tags, without annotations).
Finally, the file .git/HEAD contains the path of the head file which you currently have checked out (for instance refs/heads/master, stored as the line “ref: refs/heads/master”).
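Following those files by hand is easy to sketch in code. The snippet below resolves HEAD to a commit identifier by reading .git/HEAD and then the head file it names. It is a simplification under stated assumptions: it only handles loose ref files (not refs that Git has packed into a single file), the function names are mine, and the demo builds a throwaway folder with a placeholder identifier rather than reading a real repository.

```python
import os
import tempfile
from typing import Optional


def read_ref(git_dir: str, ref_path: str) -> Optional[str]:
    """Return the stripped contents of a ref file, or None if it is missing."""
    full_path = os.path.join(git_dir, ref_path)
    if not os.path.isfile(full_path):
        return None
    with open(full_path, "r", encoding="ascii") as ref_file:
        return ref_file.read().strip()


def resolve_head(git_dir: str) -> Optional[str]:
    """Follow HEAD to the commit identifier of the checked-out branch."""
    head = read_ref(git_dir, "HEAD")
    if head is None:
        return None
    if head.startswith("ref: "):
        # HEAD names a head file, e.g. "ref: refs/heads/master".
        return read_ref(git_dir, head[len("ref: "):])
    # Otherwise HEAD holds a commit identifier directly (detached HEAD).
    return head


# Demo on a fake .git folder; the identifier is a made-up placeholder.
DEMO_COMMIT_ID = "0123456789abcdef0123456789abcdef01234567"

with tempfile.TemporaryDirectory() as root:
    git_dir = os.path.join(root, ".git")
    os.makedirs(os.path.join(git_dir, "refs", "heads"))
    with open(os.path.join(git_dir, "HEAD"), "w", encoding="ascii") as f:
        f.write("ref: refs/heads/master\n")
    with open(os.path.join(git_dir, "refs", "heads", "master"), "w", encoding="ascii") as f:
        f.write(DEMO_COMMIT_ID + "\n")
    resolved = resolve_head(git_dir)

print(resolved)
```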

Ad blocking on iOS

Mobile browsers don’t support extensions like their desktop counterparts and most don’t have an ad blocker built-in. But it turns out that iOS (and probably Android and Windows Phone) supports good old “proxy auto-configuration” (PAC).
PAC is a mechanism by which the operating system uses a simple script file to choose when to use a proxy. The script receives the host and URL of each web request and tells the operating system whether to connect directly (as normal) or instead use a proxy.
The trick is to use a blackhole proxy (which returns no content) for URLs that are recognized as advertisements, based on a list of known domains and URL patterns.

So I dug up an existing ad-blocking PAC file which seems reasonably up to date (no-ads version 5.125 from John LoVerso), pointed the blackhole proxy at Google’s DNS server (8.8.8.8 port 53), and updated my iOS wi-fi settings to point to the file. I tested in Chrome on iPhone and iPad and this method seems to work.
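The decision logic of such a file is small. A real PAC file is JavaScript and defines a function FindProxyForURL(url, host) that returns a string such as "DIRECT" or "PROXY host:port"; the sketch below models the same logic in Python purely for illustration. The blocked domains and URL patterns are hypothetical examples of mine, not entries from the no-ads list.

```python
from fnmatch import fnmatchcase

# Return values mirror the strings a PAC script would return.
BLACKHOLE = "PROXY 8.8.8.8:53"  # the blackhole described above
DIRECT = "DIRECT"

# Hypothetical block lists, for illustration only.
BLOCKED_DOMAINS = ["ads.example.com", "tracker.example.net"]
BLOCKED_URL_PATTERNS = ["*/banners/*", "*/adserver/*"]


def find_proxy_for_url(url: str, host: str) -> str:
    """Send recognized ad requests to the blackhole, everything else direct."""
    host = host.lower()
    for domain in BLOCKED_DOMAINS:
        # Match the domain itself and any of its sub-domains.
        if host == domain or host.endswith("." + domain):
            return BLACKHOLE
    for pattern in BLOCKED_URL_PATTERNS:
        # Shell-style wildcard match against the full URL.
        if fnmatchcase(url, pattern):
            return BLACKHOLE
    return DIRECT


print(find_proxy_for_url("http://ads.example.com/pixel.gif", "ads.example.com"))
print(find_proxy_for_url("http://blog.example.org/post.html", "blog.example.org"))
```

Requests sent to the blackhole fail quickly and the page renders without them, while everything else connects as usual.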

You can try this solution by following the instructions below, but please read the security considerations first.
You should note that PAC only works for wi-fi in iOS, not on cellular or other connections.
Also, you should know that iOS 9 may have official support for ad blocking extensions. The details are not yet known.

How to install

On iOS, go to Settings > Wi-Fi and open the configuration for the wi-fi you’re connected to. At the bottom, switch the HTTP proxy option to “Auto” and copy and paste http://blog.monstuff.com/ad-block-pac.js into the box.

PAC configuration in iOS

Security considerations

Configuring a PAC file in your operating system can be dangerous. If the PAC file is adversarial or was modified by a hacker, the attacker could send part or all of your web traffic through a proxy of their choice.

What is typically recommended is for you to use your own copy of this file (you still have to host your copy securely).

The way I’m looking to solve this is to host the PAC file on a trusted CDN of immutable files. But I have not yet found an appropriate CDN.
This would allow you to review the contents of the PAC file you choose (it’s easy to check the code to see it only points to Google’s DNS servers as blackhole proxies) and have peace of mind that it cannot be surreptitiously updated.
On the other hand, this means you would have to update your OS settings if you want to use a newer version of the file.

Another approach I’m going to investigate to solve this security problem is trying to host the PAC file on the device itself. This would mean installing an iOS app containing a PAC file and referencing that file from the network settings of the OS. I’ll post an update once I try.

Any other ideas are appreciated.

Using Google DNS as blackhole

The idea of using Google DNS servers comes from the FAQ of Weblock, an iOS app which generates PAC files. The FAQ offers a good explanation for this choice:

  1. iOS requires dummy proxy to be a valid IP address accepting connections (so it’s not possible to use local IP address of your device, since there is no open port to connect to).
  2. It’s really responsive, fast and stable anywhere in the world.
  3. It’s NOT ABLE to handle HTTP/HTTPS traffic, since it’s a DNS server (it handles an entirely different protocol). It immediately closes the HTTP/HTTPS connection (which is perfect!).
  4. It’s widely recognized and well known IP, so you don’t have to be concerned about your privacy. We’re quite sure Google is not logging all web connection attempts made while blocking content from your device, since this dummy proxy is actually a DNS server supporting a different kind of requests.

Weblock also hosts some PAC files. Here are a few I’ve seen referenced: http://wl.is/zXsGpP.js, http://wl.is/EA9Ina.js and http://wl.is/KT9Ugo.js.