It would be infeasible to do so. Assuming the displayed image is a reasonable size, say 320x240, you would run out of pixel space in a JPEG (which tops out at 65,535 pixels per side) after about 38 minutes at 24 fps. Granted, that's not using the manifest, just skipping straight around in the sprite.
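For anyone who wants to check the 38-minute figure, here is a quick back-of-the-envelope sketch. It assumes frames are tiled in a plain grid inside one JPEG and that the only limit is the 65,535-pixel maximum per side; the variable names are just for illustration.

```python
# Rough capacity of a single JPEG sprite sheet holding 320x240 frames.
JPEG_MAX_SIDE = 65_535        # largest width or height a JPEG can declare
FRAME_W, FRAME_H = 320, 240   # frame size used in the comment above
FPS = 24

cols = JPEG_MAX_SIDE // FRAME_W   # whole frames that fit across
rows = JPEG_MAX_SIDE // FRAME_H   # whole frames that fit down
frames = cols * rows
minutes = frames / FPS / 60

print(f"{cols} x {rows} = {frames} frames, about {minutes:.1f} minutes")
```

That works out to 204 x 273 = 55,692 frames, a bit under 39 minutes, which matches the estimate above.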
Making this work on a large scale might be feasible, though. The easiest way would be to create the manifest while encoding the video: take the same compression scheme, but save the encoding diffs for every frame.
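A minimal sketch of what "save the diffs for every frame" could look like. This is not Apple's actual manifest format; the 8x8 block size, the function names, and the frames-as-lists-of-grayscale-rows representation are all assumptions made for illustration.

```python
BLOCK = 8  # assumed block size in pixels

def changed_blocks(prev, curr, block=BLOCK):
    """Return (block_x, block_y) for every block where curr differs from prev.

    Frames are lists of rows of grayscale ints (a simplification).
    """
    height, width = len(curr), len(curr[0])
    changed = []
    for by in range(0, height, block):
        for bx in range(0, width, block):
            rows = range(by, min(by + block, height))
            if any(prev[y][bx:bx + block] != curr[y][bx:bx + block] for y in rows):
                changed.append((bx // block, by // block))
    return changed

def build_manifest(frames):
    """One entry per frame after the first: the blocks that need repainting."""
    return [changed_blocks(a, b) for a, b in zip(frames, frames[1:])]

# Tiny usage example: a 16x16 blank frame, then one pixel lights up and stays lit.
blank = [[0] * 16 for _ in range(16)]
dot = [row[:] for row in blank]
dot[9][3] = 255  # x=3, y=9 falls in block (0, 1)
manifest = build_manifest([blank, dot, dot])
print(manifest)  # [[(0, 1)], []]
```

An encoder would do this comparison as it goes, then pack only the changed blocks into the sprite image and write their positions into the manifest.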
Apple's manifest supports spreading the video across several images (the spinning earphones use this).
Real-world video (filmed with a camcorder, not just a screencast) of any length probably wouldn't work well at all with this compression scheme, though. Apple's JS video compression basically only optimizes out parts of the image that don't change at all. If you're filming the "real world", pretty much every 8x8 block in each frame is going to have changes (it's basically guaranteed due to sensor noise, lightbulbs flickering, etc).
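To illustrate the point about noise, here is a rough sketch that compares a perfectly static frame against the same frame with tiny per-pixel jitter. The noise model (each pixel shifts by -1, 0, or +1) is a made-up stand-in for sensor noise, and the 8x8 block comparison is an assumption about how the scheme detects changes.

```python
import random

BLOCK = 8  # assumed block size in pixels

def fraction_changed(prev, curr, block=BLOCK):
    """Fraction of blocks in curr that differ from prev in at least one pixel."""
    height, width = len(curr), len(curr[0])
    total = changed = 0
    for by in range(0, height, block):
        for bx in range(0, width, block):
            total += 1
            rows = range(by, min(by + block, height))
            if any(prev[y][bx:bx + block] != curr[y][bx:bx + block] for y in rows):
                changed += 1
    return changed / total

rng = random.Random(0)  # fixed seed so the run is repeatable
still = [[128] * 320 for _ in range(240)]  # flat gray 320x240 frame
# Hypothetical sensor noise: every pixel moves by -1, 0, or +1.
noisy = [[p + rng.choice((-1, 0, 1)) for p in row] for row in still]

screencast = fraction_changed(still, still)  # nothing moved between frames
camcorder = fraction_changed(still, noisy)   # same scene, plus noise

print(f"static frame: {screencast:.0%} of blocks changed")
print(f"noisy frame:  {camcorder:.0%} of blocks changed")
```

Even with noise that small, essentially every block counts as changed, so a scheme that only skips unchanged blocks saves nothing.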
Seems like a terrible use of time.