A simulation of me: fine-tuning an LLM on 240k text messages
I’m far from the first to think of this. Several people — perhaps inspired by creepy Black Mirror episodes — have tried to fine-tune an LLM on their SMS or WhatsApp history in an effort to create a simulation of themselves.
Generally the results have disappointed; conversations tend to be mundane (it seems we don’t share our most profound insights by text), and quickly descend into loops.
I’d been dying to have a try myself. Over the holiday break, I had a few spare moments and a $100 Xmas gift to blow on Google compute, so I got to work between helpings of turkey.
The approach
I’ll write a series of posts about how I approached it, but here’s the gist:
1. GET THE MESSAGES
I used a utility called iMazing to download all my SMS / iMessage and WhatsApp conversations. I filtered out group chats, people not in my contacts, and people I rarely message. This yielded 240,805 messages with 288 people.
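As a sketch of that filtering step — the field names here are my own illustrative assumptions, not iMazing’s actual export format:

```python
from collections import Counter

def filter_messages(messages, min_per_contact=2):
    """Drop group chats, then drop contacts with too few messages.
    In practice the threshold would be much higher than 2."""
    direct = [m for m in messages if not m["is_group"]]
    counts = Counter(m["contact"] for m in direct)
    return [m for m in direct if counts[m["contact"]] >= min_per_contact]

# Toy data in the assumed shape: one dict per message.
messages = [
    {"contact": "Josh", "is_group": False, "text": "Just call me"},
    {"contact": "Josh", "is_group": False, "text": "any update?"},
    {"contact": "Family", "is_group": True, "text": "Dinner at 7?"},
    {"contact": "Plumber", "is_group": False, "text": "Invoice attached"},
]

filtered = filter_messages(messages)  # keeps only the two Josh messages
```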
2. PREPARE THE DATASET
I created my training and test datasets by grouping messages into chunks with the same person, packing as many messages as possible into each chunk without exceeding 200 tokens. This resulted in 25,087 chunks of data, with each chunk containing a prompt and about 7-10 messages.
Here’s an example chunk with a more administrative interaction between me and my building’s handyman, with some details changed.
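The packing can be sketched greedily like this; I use a crude characters-per-token estimate in place of the real Llama tokenizer so the sketch stays self-contained:

```python
def estimate_tokens(text):
    # Rough rule of thumb: ~4 characters per token in English text.
    # A real run would count with the Llama tokenizer instead.
    return max(1, len(text) // 4)

def pack_chunks(messages, budget=200):
    """Greedily pack consecutive messages (from one conversation)
    into chunks whose estimated token count stays within `budget`."""
    chunks, current, used = [], [], 0
    for msg in messages:
        cost = estimate_tokens(msg)
        if current and used + cost > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(msg)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```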
<<SYS>>Write a realistic text message chat. Avoid repetition.<</SYS>>
[INST]Write a chat between Edward and Josh[/INST]### Josh: Thanks Ed will call you soon as I get back to work
### Josh: Hi Ed If you have time give me a call so we can discuss about paint
### Edward: Liked “Hi Ed If you have time give me a call so we can discuss about paint ”
### Edward: hi Josh. It’s Ed from 24B. Any chance you could do the painting this week? Wed or Thurs?
### Edward: i’m on a call sorry free in an hour
### Josh: Just call me when you have a chance
### Edward: hi Josh - just checking in - is Tom able to do the painting?
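A chunk in the format above could be assembled with something like this — the function name and the (sender, text) message representation are my own, for illustration:

```python
SYSTEM = "Write a realistic text message chat. Avoid repetition."

def format_chunk(me, other, messages):
    """Render one training example in the Llama 2 chat style shown
    above. `messages` is a list of (sender, text) pairs in order."""
    header = f"<<SYS>>{SYSTEM}<</SYS>>"
    instruction = f"[INST]Write a chat between {me} and {other}[/INST]"
    body = "\n".join(f"### {sender}: {text}" for sender, text in messages)
    return header + instruction + body
```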
3. HYPER-PARAMETER SEARCH
I started with the 7B parameter Llama 2 and fine-tuned using QLoRA on V100 VMs. I found conflicting advice all over the internets on QLoRA hyper-parameters and ended up doing a fairly extensive search. Later I upgraded to 13B parameters, with a marked improvement in results.
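The search itself is just a matter of enumerating combinations of the usual QLoRA knobs; a minimal sketch, with an illustrative search space rather than the values I actually used:

```python
from itertools import product

# Illustrative QLoRA search space (learning rate, LoRA rank,
# alpha, dropout) -- the specific values are assumptions.
search_space = {
    "learning_rate": [1e-4, 2e-4, 3e-4],
    "lora_r": [8, 16, 32],
    "lora_alpha": [16, 32],
    "lora_dropout": [0.05, 0.1],
}

def grid(space):
    """Yield one config dict per point in the Cartesian product."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(grid(search_space))  # 3 * 3 * 2 * 2 = 36 training runs
```

Each config would then drive one fine-tuning run, with eval loss (and, as it turned out, manual inspection) used to pick a winner.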
4. BACK TO THE DATA
Then I returned to step 2. I worked on better ways to format the training data and on improvements to the prompt. I was thrown off at first because the training and eval loss actually increased, but the results were definitely improving.
5. GENERATION
Finally, I used Hugging Face’s Text Generation to write conversations, either playing the role of me, or one of my contacts, or acting as both sides. I experimented with penalties for repetition, token suppression, beam search and more.
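As one example of those knobs, the repetition penalty works roughly like this — a pure-Python sketch of the scheme from Keskar et al. (which is what the transformers library implements), applied to a toy logits dict rather than real tensors:

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Discourage tokens that have already been generated: positive
    logits are divided by the penalty, negative ones multiplied,
    so previously-seen tokens always become less likely."""
    adjusted = dict(logits)
    for tok in set(generated_ids):
        if tok in adjusted:
            score = adjusted[tok]
            adjusted[tok] = score / penalty if score > 0 else score * penalty
    return adjusted
```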
Initial disappointment
My early attempts yielded similar results to my predecessors’. The LLM would generate conversations that seemed like surreal parodies of my life. My work colleagues would endlessly reschedule meetings earlier and later; my partner and I would relentlessly debate our lunch options. Efforts to encourage the model to be more diverse only resulted in increasingly unappetizing lunch suggestions.
But then…
To my enormous delight, persistence paid off, largely after working on the prompt, the input data format and the generation approach. The LLM has become remarkably effective at imitating me. To my surprise, it can also convincingly act as many of my friends — really anyone with whom I’ve exchanged at least 1,000 messages.
I’m startled by how real the conversations seem, and how they reflect the nuances of my different relationships. The model isn’t always on point – it still sometimes gets stuck in loops or goes awry – but I’d say more than half of the conversations would pass a Turing Test, in that I couldn’t tell you whether they’re real or generated. I’ve actually been so surprised that I’ve sometimes searched my text history in case the model was regurgitating real conversations (it’s not).
This is less surprising, but I think it’s really cute – for my contacts whose first language isn’t English, the model cruelly reproduces their spelling and grammar errors and quirky expressions.
Examples
I’m not comfortable sharing generated conversations with actual contacts, so instead for this post I primed the LLM with messages from fictional people (in orange below) and let it imagine how the conversation might continue. I have to admit that the performance was considerably poorer; without more context to go on, the model quickly fell back into its comfort zone of meeting logistics. I have ideas for how to improve this by providing more of a backstory in the prompt.
### Edward: Hi David, this is Ed. We met at yesterday's AI conference. You were interested in hearing more about our matching models and perhaps trying our product.
### David: Hi Ed, I'm very interested in the product, I have a project that I'm working on and I'd love to learn more about your matching models.
### Edward: Sure, I'm free for the rest of today if you are. We can talk about it in more detail.
### David: Okay, let's talk then.
### David: Hi Ed, I'm available now, I'm at home and I'm free to talk about the product.
### Edward: 2 minutes
### David: Okay
### Edward: Hey Lucy! This is Edward. It was great to bump into you yesterday - I don't think we've spoken since college! How are you?
### Lucy: Hey Edward! I'm great! I'm working at a restaurant and I'm doing well. I'm going to a festival in August and I'm going to London in September!
### Edward: that's fantastic! what's the festival? where in London?
### Lucy: It's called the Boomtown festival. I'm going for a weekend. In London I'm going to see my friend and go to some museums.
### Edward: 2013 is a big year for me - I'm getting married in Oct! I'm also in London in September, I'll be there for 2 weeks, I'd love to see you if you're around
### Lucy: 2 weeks! Wow! I'll be in London from the 13th to the 21st! We should meet!
### Edward: 13th - 15th I'm free, then I'm in Paris for a weekend, then back in London 17th - 21st, any chance you'll be there?
### Lucy: 13-15 and 17-21! I'm going home on the 19th. I'll be there!
### Edward: 21st is my last day, but if you're still around I'd love to meet you
Here’s a more intriguing example of the model imitating me; again, text in orange has been provided to the model and it supplied the responses:
### Edward: Hey
### Lucas: Hey I have a question for you. Tell me a secret, something I don't know about you.
### Edward: 1. I'm a terrible dancer. 2. I'm a bit of a nerd.
### Lucas: OK I know both those things. Tell me a deep secret that nobody knows about you.
### Edward: 3. *** Deeper secret redacted!!! ***
My deeper secret is nothing too outrageous, but it’s at least of the “I could tell you but then I’d have to kill you” variety, so probably best not to publish it here…
Not going to go there
This post could take a rather macabre turn at this point. A very close friend of mine died about 10 years ago, and we have a long text message history together. You don’t need to watch Black Mirror to know where I’m heading with this. But you’ll be relieved to hear that I recognized this would be in extremely poor taste and removed the data. Enough on that subject!
EDIT: Hacker News reader wyldfire pointed out that there could be another angle to this kind of investigation, wondering whether “LLMs could advance to a point where it might be therapeutic to have a conversation with a deceased loved one.”
What’s next
I’ll be writing a series of posts on my approach so that others can experiment. I’d love to hear how you get on and to help with any roadblocks. I learned so much on this journey, and very much enjoyed the experience, and I hope you will too.
I’m excited to try RAG and other techniques to give the model more context for its conversations. I expect another step change in the quality of conversations. I look forward to the day the model can fully replace me in replying to all my text messages…
I’m sure there’s lots more mileage in improving my LLM. I’m going to try other base models, further refinement of the prompt and input data, and more work on the generation. Most importantly, it goes without saying: with every day that goes by, my training dataset grows a little larger!