Autogenerating a Book Series From Three Years of iMessages

I’m constantly annoyed by the things that I can’t remember. And when I’m trying to recall the details of something, I often turn to my text messages. Thanks to big improvements recently, it’s now pretty fast to search my entire iMessage history on my phone, provided that I can remember some verbatim part of the message I’m looking for. And often, once I’m in the past, I want to look around: text messages from ages ago provide surprisingly interesting insights into the past.
But iMessage isn’t set up well for this casual browsing: when you try to scroll away from a search result, the loading is very slow. And the interface offers no way to jump to a specific date. I’d really like to be able to “flip through” my messages and stop at a random place for a view into that moment in time. Apple doesn’t provide a way to do that, so, I thought, why not enable it myself? I thought it’d be great to enable this “flipping through messages” in the most literal way possible: by making a physical book of my largest conversation.
In order to do anything at all with the messages, I needed to get them off my phone and onto my computer. I’d looked time and again for a way to do this with Signal, so I wasn’t sure what I’d find, but I was pleased that it seemed relatively easy to pull messages off an iPhone (even easier if your messages are already on a Mac). According to the very helpful iPhone Wiki, all I had to do was grab sms.db from a backup of my phone, and I’d have a SQLite database that I could do whatever I liked with.
This simplicity seemed a bit too good to be true. For some reason I expected a proprietary format that would be a pain to reverse-engineer, so I had to see it for myself. I took a standard backup on my Mac in Finder (that was a trip: the “plugged-in iPhone” UI has barely changed since I used iTunes to sync music to my iPod touch in seventh grade). While the backup format is really not complicated, it was intimidating browsing the backup folder at first because an ls in the root directory yields a bunch of directories named after a single hex byte:
/.../00008120-001854410CEB401E >>> ls
00 0e 1c 2a 38 46 54 62 70 7e 8c 9a a8 b6 c4 d2 e0 ee fc
01 0f 1d 2b 39 47 55 63 71 7f 8d 9b a9 b7 c5 d3 e1 ef fd
02 10 1e 2c 3a 48 56 64 72 80 8e 9c aa b8 c6 d4 e2 f0 fe
03 11 1f 2d 3b 49 57 65 73 81 8f 9d ab b9 c7 d5 e3 f1 ff
04 12 20 2e 3c 4a 58 66 74 82 90 9e ac ba c8 d6 e4 f2 Info.plist
05 13 21 2f 3d 4b 59 67 75 83 91 9f ad bb c9 d7 e5 f3 Manifest.db
06 14 22 30 3e 4c 5a 68 76 84 92 a0 ae bc ca d8 e6 f4 Manifest.db-shm
07 15 23 31 3f 4d 5b 69 77 85 93 a1 af bd cb d9 e7 f5 Manifest.db-wal
08 16 24 32 40 4e 5c 6a 78 86 94 a2 b0 be cc da e8 f6 Manifest.plist
09 17 25 33 41 4f 5d 6b 79 87 95 a3 b1 bf cd db e9 f7 Status.plist
0a 18 26 34 42 50 5e 6c 7a 88 96 a4 b2 c0 ce dc ea f8
0b 19 27 35 43 51 5f 6d 7b 89 97 a5 b3 c1 cf dd eb f9
0c 1a 28 36 44 52 60 6e 7c 8a 98 a6 b4 c2 d0 de ec fa
0d 1b 29 37 45 53 61 6f 7d 8b 99 a7 b5 c3 d1 df ed fb
Entering one of these directories yields a bunch of files whose names start with the hex byte the directory is named after:
/.../00008120-001854410CEB401E >>> cd 3d
/.../00008120-001854410CEB401E/3d >>> ls
3d0292d3fe90e1e22c247403c0e9105ea0f9ff44 3d8830b71e98aae80b6eaf8bdd5500d79ce74946
3d02fe309afa7de839822d6f1b8433aa90090d17 3d88cdc16ff2b5231e5ea4b52271ee195a6f4b96
3d072c4fca5db4a5678fa10b137435f757e98492 3d8a425d70f4049417e855d273c44d8199de30c9
3d0739c90579fa907246d5c21bd8d8ebaa2d9d6b 3d8a43a1921f504bb4393250f75b24bfc2c5cedb
3d0798b3cc4d2f5ad347ffb8bc5a0f9d8c82cfb9 3d8a7c0460aadabf1b7fc9adea9e6a2a6e7bc73b
3d07a0adc5c5c22dc525ccd3a93fb05a50ef1ac5 3d8b6ad12c7617b3d783790a457b0aa19b193b68
3d0880f091c51ddc145e17c78d8e6f9a3e7e20c8 3d8b82abe05a9d697102d8b665c9d499e07492ea
3d093e92cf03abf3650411e09a647630a1e0c478 3d8ba897240ad32580bf8dfd00db8f181658cdfd
3d095e908ff898be3b3ffd64a75db959a58ac70a 3d8bc227d67ec4944df8e75291102367034d7214
3d09d5dcd5a9bdad67a80cd83201a9e1fb75aada 3d8c722f1d92f7cd6f90c936c14f60f51aad128b
3d0abb83123be82abf43ce20118e72fea06023c5 3d8ca6eeabeb1c01fae05bb20f08dedf734cfd04
3d0b246304c42d2ab1eb1892d629fcdfde689cb7 3d8d0c6b1bf7946c6bef91d60cccb32207b7bc01
3d0bb5f49e6f0e31348ef8feb9a38d4ce71f5ec7 3d8fd2fbcaf3079a683a8e486ecde8875f0a591d
3d0c1283936c45fec533a507b78558b5aa3159fa 3d8ff93bd94b3ea14edc77d1e677cf4ee4306e4e
3d0cb8e28462780bb9af1440e297ecd8224c70ff 3d90ea8bfbf62feda080cd0ccbd12fa5c8673993
3d0ce10de5f69606c52882215b99ebab259dc194 3d932638fe8ed669725b7a143c6a8b02b8959923
3d0d7e5fb2ce288813306e4d4636395e047a3d28 3d93c92679aa9d398331e27fdeed64b5094e68d1
...
Looking at these with a nice file explorer that inspects magic bytes to determine file types (I use Thunar) helps make some sense of it, since it can show that these cryptic names really are just regular old images and other files. But really even that’s unnecessary, since the iPhone Wiki told us that the filename for the sms.db file that we’re looking for is 3d0d7e5fb2ce288813306e4d4636395e047a3d28. Copying this to my home directory:
$ cp 3d0d7e5fb2ce288813306e4d4636395e047a3d28 ~/imessages.db
And opening it up with the sqlite3 CLI, we can actually see some tables!
~ >>> sqlite3 imessages.db
SQLite version 3.44.2 2023-11-24 11:41:44
Enter ".help" for usage hints.
sqlite> .tables
_SqliteDatabaseProperties message
attachment message_attachment_join
chat message_processing_task
chat_handle_join recoverable_message_part
chat_message_join sync_deleted_attachments
chat_recoverable_message_join sync_deleted_chats
deleted_messages sync_deleted_messages
handle unsynced_removed_recoverable_messages
kvtable
sqlite>
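A side note on the cryptic 3d0d7e5f... filename used above: to my understanding, each file in an iOS backup is stored under the SHA-1 hash of its domain and relative path joined by a hyphen, inside a directory named after the first two hex characters of that hash. The sketch below assumes the messages database lives at Library/SMS/sms.db in HomeDomain; the function names are made up for illustration.

```python
import hashlib


def backup_file_id(domain: str, relative_path: str) -> str:
    """Return the 40-character hex name a file gets inside the backup folder.

    Assumption: the name is SHA-1("<domain>-<relative_path>").
    """
    return hashlib.sha1(f"{domain}-{relative_path}".encode("utf-8")).hexdigest()


def backup_file_path(domain: str, relative_path: str) -> str:
    """Path relative to the backup root: two-hex-character directory, then the hash."""
    file_id = backup_file_id(domain, relative_path)
    return f"{file_id[:2]}/{file_id}"


# Where the messages database should sit inside the backup folder.
print(backup_file_path("HomeDomain", "Library/SMS/sms.db"))
```

The same helper should work for locating other files in the backup, as long as you know their domain and path.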
The schema requires a couple of joins to extract an actual conversation, but without too much trouble we can start to pull out messages (in this case from CVS spamming me):
sqlite> select
message.ROWID, message.date, message.text, message.is_from_me from message
inner join chat_message_join on message_id=message.ROWID
inner join chat on chat.ROWID=chat_message_join.chat_id
where chat.chat_identifier='28732'
order by date asc;
278125|694030292385607040||0
278327|694647875648848000||0
...
314056|726702453329793024||0
314412|727316171079934976|CVS ExtraCare: 20% off one full-price item, just because. Tap the link to send to card: c.cvs.com/B0kjBMbNM|0
We got one, but a lot of blank ones too: many of the messages are missing! It turns out that for some messages, the message data is stored in an encoded NSMutableAttributedString binary blob in the message.attributedBody column instead of in message.text. With a bit of wrangling to get the binary data out of the SQLite CLI, we can look at one of these missing messages and see that the data is indeed there:
~ >>> sqlite3 imessages.db "select hex(attributedBody) from message where ROWID=278125;" \
| cut -d\' -f2 \
| xxd -r -p \
| xxd -g1
00000000: 04 0b 73 74 72 65 61 6d 74 79 70 65 64 81 e8 03 ..streamtyped...
00000010: 84 01 40 84 84 84 19 4e 53 4d 75 74 61 62 6c 65 ..@....NSMutable
00000020: 41 74 74 72 69 62 75 74 65 64 53 74 72 69 6e 67 AttributedString
00000030: 00 84 84 12 4e 53 41 74 74 72 69 62 75 74 65 64 ....NSAttributed
00000040: 53 74 72 69 6e 67 00 84 84 08 4e 53 4f 62 6a 65 String....NSObje
00000050: 63 74 00 85 92 84 84 84 0f 4e 53 4d 75 74 61 62 ct.......NSMutab
00000060: 6c 65 53 74 72 69 6e 67 01 84 84 08 4e 53 53 74 leString....NSSt
00000070: 72 69 6e 67 01 95 84 01 2b 81 f3 00 43 56 53 20 ring....+...CVS
00000080: 45 78 74 72 61 43 61 72 65 3a 20 24 32 20 6f 66 ExtraCare: $2 of
00000090: 66 20 79 6f 75 72 20 70 75 72 63 68 61 73 65 2c f your purchase,
000000a0: 20 6a 75 73 74 20 66 6f 72 20 79 6f 75 21 20 49  just for you! I
000000b0: 6e 20 73 74 6f 72 65 20 6f 72 20 6f 6e 6c 69 6e n store or onlin
000000c0: 65 2e 20 54 61 70 20 74 68 65 20 6c 69 6e 6b 20 e. Tap the link
000000d0: 74 6f 20 73 65 6e 64 20 64 65 61 6c 20 74 6f 20 to send deal to
000000e0: 63 61 72 64 3a 20 63 2e 63 76 73 2e 63 6f 6d 2f card: c.cvs.com/
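To make the layout in that dump concrete, here is a rough heuristic for pulling the text out of such a blob. It is not a real parser for the format and not how the imessage-database crate does it. It only relies on what the dump shows: the NSString class name is followed by five bytes (01 95 84 01 2b), then a length, then the UTF-8 text. The length appears to be a single byte for short strings, or 0x81 followed by a two-byte little-endian value (f3 00, so 243 bytes, in the dump above). The function name is hypothetical.

```python
from typing import Optional


def extract_text(blob: bytes) -> Optional[str]:
    """Heuristically pull the message text out of an attributedBody blob."""
    marker = blob.find(b"NSString")
    if marker == -1:
        return None
    # Skip the class name plus the five bytes seen in the dump (01 95 84 01 2b).
    rest = blob[marker + len(b"NSString") + 5:]
    if not rest:
        return None
    if rest[0] == 0x81:
        # Assumed long form: 0x81, then a two-byte little-endian length.
        length = int.from_bytes(rest[1:3], "little")
        start = 3
    else:
        # Assumed short form: the byte itself is the length.
        length = rest[0]
        start = 1
    return rest[start:start + length].decode("utf-8", errors="replace")
```

A heuristic like this will break on blobs that do not follow the exact layout above, which is a good reason to lean on an existing parser instead.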
Luckily, we don’t need to implement the parsing for this binary format ourselves. There’s a great imessage-database crate that does exactly this: it ingests an iMessage database and outputs the data in nice Rust data structures. Out of the box, it comes with a binary (imessage-exporter) to generate text or HTML versions of your conversations, which is really pretty similar to my goal.
With just a couple of tweaks to the SQL statement the library uses to fetch messages, I’m able to narrow down the query to just a single conversation. But for this project I want to make a nicely formatted physical book that I can hold in my hand and flip through, and the HTML and text formats that the project ships with won’t quite work for this.
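The library’s own SQL is not reproduced here. As a rough illustration of what narrowing to one conversation looks like, the sketch below runs the same join as the earlier CLI query through Python’s sqlite3 module against a tiny in-memory stand-in for the database. The toy schema keeps only the columns used in this post. The date conversion assumes the date column counts nanoseconds since 2001-01-01 UTC, which fits the values shown earlier but is not stated by the post.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Assumption: message.date is nanoseconds since this epoch.
APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)

QUERY = """
select message.ROWID, message.date, message.text, message.is_from_me
from message
inner join chat_message_join on chat_message_join.message_id = message.ROWID
inner join chat on chat.ROWID = chat_message_join.chat_id
where chat.chat_identifier = ?
order by message.date asc
"""


def apple_ns_to_datetime(ns: int) -> datetime:
    """Convert an Apple-epoch nanosecond timestamp to a UTC datetime."""
    return APPLE_EPOCH + timedelta(seconds=ns // 1_000_000_000)


def fetch_conversation(conn: sqlite3.Connection, chat_identifier: str) -> list:
    """Return (ROWID, date, text, is_from_me) rows for one chat, oldest first."""
    return conn.execute(QUERY, (chat_identifier,)).fetchall()


def make_demo_db() -> sqlite3.Connection:
    """Build a toy database with just the columns this post touches."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table message (ROWID integer primary key, date integer,
                              text text, is_from_me integer);
        create table chat (ROWID integer primary key, chat_identifier text);
        create table chat_message_join (chat_id integer, message_id integer);
        insert into chat values (1, '28732'), (2, 'someone-else');
        insert into message values
            (278125, 694030292385607040, NULL, 0),
            (314412, 727316171079934976, 'CVS ExtraCare: ...', 0),
            (400000, 700000000000000000, 'unrelated', 1);
        insert into chat_message_join values (1, 278125), (1, 314412), (2, 400000);
    """)
    return conn


for rowid, date_ns, text, is_from_me in fetch_conversation(make_demo_db(), "28732"):
    print(rowid, apple_ns_to_datetime(date_ns).date(), text, is_from_me)
```

Passing the chat identifier as a bound parameter rather than splicing it into the SQL string keeps the query reusable for other conversations.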
I’m a big fan of LaTeX because of the beautiful documents it can be convinced to produce, and since leaving school I have been itching to generate some more pretty PDFs. And since LaTeX’s text-based source code makes it perfect for templating and autogeneration, it seems like a great choice. I’ll build my book by spitting out LaTeX code for every text message in the conversation.
Thanks to the imessage-database library it’s fairly easy to iterate through all the messages in the conversation, so I start by generating LaTeX code for each message. My first approach to this LaTeX generation is quite simple: align left if the message is from me and right otherwise, insert some text indicating an attachment where images are sent, and skip things like reactions and replies that I don’t want to bother rendering. This initial approach works well, and after splitting the text up into chapters based on date and a bit of visual tweaking, I’m pleased.
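As a minimal sketch of that first pass (the real project is written in Rust and may look quite different), the function below renders one message into a flushleft or flushright environment, escapes LaTeX’s special characters, and adds a placeholder for attachments. The function names and the placeholder wording are assumptions made for illustration.

```python
from typing import Optional

# Characters that have special meaning in LaTeX and their escaped forms.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text: str) -> str:
    """Escape special characters one at a time so nothing is escaped twice."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def render_message(text: Optional[str], is_from_me: bool,
                   has_attachment: bool = False) -> str:
    """Render one message: left-aligned if it is from me, right-aligned otherwise."""
    env = "flushleft" if is_from_me else "flushright"
    parts = []
    if text:
        parts.append(escape_latex(text))
    if has_attachment:
        # Hypothetical placeholder text standing in for an image.
        parts.append(r"\textit{[attachment]}")
    body = " ".join(parts)
    return f"\\begin{{{env}}}\n{body}\n\\end{{{env}}}\n"


print(render_message("20% off & just because", is_from_me=False))
```

Escaping character by character matters here: chained string replacements would re-escape the backslashes and braces introduced by earlier replacements.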
But there’s one major problem: LaTeX, at least in its classic pdfLaTeX form, doesn’t handle arbitrary Unicode. Of course, this means that as soon as I extend the rendering window enough to include an emoji, the LaTeX compiler explodes. Simply stripping out emojis from the source text works, but is hardly a tolerable solution. After all, emojis are integral to modern communication.
After a bit of research, it looks like XeLaTeX is the key: it adds support for Unicode fonts to LaTeX. Switching to XeLaTeX proves quite easy, and by defining \emojifont to point at an emoji font and wrapping every emoji in {\emojifont X} in my generated LaTeX source, the output renders successfully with emojis inline. But I don’t want to pay for every page of my book to be printed in color when I print it. Luckily, Google’s Noto Emoji font has a great set of simple black-and-white emojis that are perfect for this purpose. I’m pretty happy with the way these emojis look in print.
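The wrapping step can be sketched as a regular-expression pass over each message. The code point ranges below are a rough approximation of "emoji" chosen for illustration, not an exhaustive list. The sketch also assumes the preamble defines the font switch with something like fontspec’s \newfontfamily\emojifont{Noto Emoji}, which is my guess at one way to set it up.

```python
import re

# Approximate emoji ranges (an assumption): pictographs plus older symbol blocks.
_EMOJI = "\U0001F300-\U0001FAFF\u2600-\u27BF"
# A run starts with an emoji and may continue with more emoji, variation
# selectors (U+FE0F), or zero-width joiners (U+200D) used in combined emoji.
EMOJI_RUN = re.compile(f"[{_EMOJI}][{_EMOJI}\uFE0F\u200D]*")


def wrap_emoji(text: str) -> str:
    """Wrap each run of emoji in {\\emojifont ...} so XeLaTeX switches fonts."""
    return EMOJI_RUN.sub(lambda m: "{\\emojifont " + m.group(0) + "}", text)


print(wrap_emoji("sounds good \U0001F44D"))
```

If the text also needs LaTeX escaping, the escaping should run first, since the wrapper introduces braces and a backslash that must stay unescaped.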
After a couple of extra niceties like a header that tracks the current date (with a LaTeX command that sets \markright with every message), I’m ready to put it all together.
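One way the date-based chapters and the running date header could fit together is sketched below. The post does not say how chapters are divided, so one chapter per calendar month is purely an assumption, as are the function names. Each message is preceded by a \markright call so the page header can show the date of the messages on that page.

```python
from datetime import date
from typing import Iterable, Tuple


def date_header(day: date) -> str:
    """LaTeX that updates the right-hand running header to this date."""
    return f"\\markright{{{day:%B} {day.day}, {day.year}}}"


def render_with_chapters(messages: Iterable[Tuple[date, str]]) -> str:
    """Join (date, already-rendered LaTeX) pairs, sorted by date, into one body.

    Assumption: a new chapter starts whenever the calendar month changes.
    """
    out = []
    current_month = None
    for day, body in messages:
        month = (day.year, day.month)
        if month != current_month:
            out.append(f"\\chapter{{{day:%B} {day.year}}}")
            current_month = month
        out.append(date_header(day))
        out.append(body)
    return "\n".join(out)


print(render_with_chapters([
    (date(2022, 12, 29), "first message"),
    (date(2023, 1, 2), "second message"),
]))
```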
When I finally compile all three years of messages that I want to be able to flip through, I’m surprised to find that the compiler dumps out well over a thousand pages of messages when I put them into a standard 6″ x 9″ page size. Since it’s exactly three years of messages anyway, though, there’s an easy solution: I split the opus into three volumes to get the size of each one down to something printable.
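The split itself is simple date arithmetic. A minimal sketch, assuming each volume covers one year counted from the date of the first message (the post does not say exactly where the boundaries fall):

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Tuple


def volume_number(first_day: date, day: date) -> int:
    """1-based volume index: volume 1 is the first year after first_day, and so on."""
    years = day.year - first_day.year
    # Not yet at this year's anniversary of the first message: still the previous volume.
    if (day.month, day.day) < (first_day.month, first_day.day):
        years -= 1
    return years + 1


def split_into_volumes(messages: Iterable[Tuple[date, str]]) -> Dict[int, List[str]]:
    """Group (date, rendered LaTeX) pairs, sorted by date, by volume number."""
    volumes: Dict[int, List[str]] = defaultdict(list)
    first_day = None
    for day, body in messages:
        if first_day is None:
            first_day = day
        volumes[volume_number(first_day, day)].append(body)
    return dict(volumes)
```

With roughly 1300 pages in total, three volumes come out to a bit over 430 pages each on average.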
When I decided to try this, I really wanted to end up with a physical book in my hand. So I had to figure out how to get these books printed. And to my surprise, printing a paperback book is quite cheap. After reviewing a bunch of options, Barnes and Noble Press seems like the best one. It’s noticeably more expensive than some of the alternatives like Lulu and Amazon KDP, but most options are targeted at people who are trying to sell their books. B&N Press is too, but their story for personal books seems better than the others, as you don’t need to “publish” your book to get it printed. And the price is still pretty reasonable: I was able to print all three volumes, around 1300 pages total, for $30 including shipping.
Before I can order books from my LaTeX-generated PDFs, the website tells me that the last step is to create covers. Upon uploading the body pages to B&N Press, the site generates the dimensions required for the cover. Given these, I threw together a cover for each of the three volumes in Inkscape, which the website accepted without complaint.
The B&N Press website is not perfect: it is sometimes very slow, and while I was trying to place my order the checkout page was broken and wouldn’t show up for over 24 hours. But after that was fixed, ordering worked.
And sure enough, after a couple of weeks’ wait, I had three actual books in hand. I flip through them often, and it’s so much easier to revisit old conversations this way than trying to do so on my phone.
The source code is in rough shape, and I haven’t packaged it as a cargo binary, but there’s not much of it. If you want to take a look or try it for yourself, it’s available at https://github.com/bkettle/message-book.