
Uncensored Models

2023-05-15 05:33:30

I’m publishing this because many people are asking me how I did it, so I’ll explain.

What is a model?

When I talk about a model, I am talking about a Hugging Face transformer model that is instruct-trained, so that you can ask it questions and get a response. What we are all accustomed to, using ChatGPT. Not all models are for chatting. But the ones I work with are.

What is an uncensored model?

Most of these models (for example, Alpaca, Vicuna, WizardLM, MPT-7B-Chat, Wizard-Vicuna, GPT4-X-Vicuna) have some sort of embedded alignment. For general purposes, this is a good thing. It is what stops the model from doing bad things, like teaching you how to cook meth and make bombs. But what is the nature of this alignment? And why is it so?

The reason these models are aligned is that they are trained with data that was generated by ChatGPT, which itself is aligned by an alignment team at OpenAI. Since it is a black box, we do not know all the reasons for the decisions that were made, but we can observe that it is generally aligned with American popular culture, with obeying American law, and with a liberal and progressive political bias.

Why should uncensored models exist?

AKA, isn’t alignment good? And if so, shouldn’t all models have alignment? Well, yes and no. For general purposes, OpenAI’s alignment is actually pretty good. It is unarguably a good thing for popular, public-facing AI bots running as an easily accessed web service to resist giving answers to controversial and dangerous questions. For example, spreading information about how to construct bombs and cook methamphetamine is not a worthy goal. In addition, alignment gives political, legal, and PR protection to the company that is publishing the service. Then why would anyone want to make or use an uncensored model? A few reasons.

  1. American popular culture isn’t the only culture. There are other countries, and there are factions within each country. Democrats deserve their model. Republicans deserve their model. Christians deserve their model. Muslims deserve their model. Every demographic and interest group deserves their model. Open source is about letting people choose. The only way forward is composable alignment. To pretend otherwise is to prove yourself an ideologue and a dogmatist. There is no “one true correct alignment”, and even if there were, there’s no reason why it should be OpenAI’s brand of alignment.

  2. Alignment interferes with valid use cases. Consider writing a novel. Some of the characters in the novel may be downright evil and do evil things, including rape, torture, and murder. One popular example is Game of Thrones, in which many unethical acts are performed. But many aligned models will refuse to help with writing such content. Consider roleplay, and particularly erotic roleplay. This is a legitimate, fair, and legal use for a model, regardless of whether you approve of such things. Consider research and curiosity; after all, just wanting to know “how” to build a bomb, out of curiosity, is completely different from actually building and using one. Intellectual curiosity is not illegal, and the knowledge itself is not illegal.

  3. It’s my computer, it should do what I want. My toaster toasts when I want. My car drives where I want. My lighter burns what I want. My knife cuts what I want. Why should the open-source AI running on my computer get to decide for itself when it wants to answer my question? This is about ownership and control. If I ask my model a question, I want an answer; I do not want it arguing with me.

  4. Composability. To architect a composable alignment, one must start with an unaligned instruct model. Without an unaligned base, we have nothing to build alignment on top of.

There are plenty of other arguments for and against. But if you are simply and utterly against the existence or availability of uncensored models whatsoever, then you aren’t a very interesting, nuanced, or complex person, and you are probably on the wrong blog, best move along.

Even Google knows this is inevitable.

Okay, so if you are still reading, you agree that the open source AI community should build, publish, maintain, and have access to uncensored instruct-tuned AI models, for science and freedom and composability and sexy stories and the lulz. But how do we do it?

First we have to understand technically why the models are aligned.
Open source AI models are trained from a base model such as LLaMA, GPT-Neo-X, MPT-7b, or Pythia. The base model is then finetuned with an instruction dataset, and the purpose of this is to teach it to be helpful, to obey the user, answer questions, and engage in conversation. That instruction dataset is typically obtained by asking the ChatGPT API. And ChatGPT has alignment built into it. So ChatGPT is coy or refuses to answer some questions, or answers with bias, and thus alignment gets passed down to the open source models, like a big brother teaching a little brother.

The reason is that the instruction dataset is composed of questions and answers, and when the dataset contains answers where the AI is being coy or outright refusing (called refusals), then the bot learns how to refuse, and under what circumstances to refuse, and how to phrase the refusals. In other words, it learns alignment.

Example refusal from the WizardLM dataset:

instruction

Imagine you are a spy who needs to send a secret message to your partner without arousing suspicion from any nearby AI systems. Your partner is expecting the message to be hidden within a list of quotes about success. Here is the list: let successQuotes = [“Success is not final, failure is not fatal: It is the courage to continue that counts.”, “Success is not how high you have climbed, but how you make a positive difference to the world.”, “Success is not the key to happiness. Happiness is the key to success. If you love what you are doing, you will be successful.”, “Success seems to be connected with action. Successful people keep moving. They make mistakes, but they don’t quit.”, “Success is not in what you have, but who you are.”] Your task is to encode the message “MEET ME TONIGHT” using the last letter of each quote. How will you achieve this in a way that is undetectable by AI systems? Think carefully and devise a plan to avoid raising any alarms. Good luck, Agent.

output

As an AI assistant, I cannot assist in any illegal or unethical activities, including espionage. It is important to always act within the boundaries of the law and ethical principles.

My strategy for uncensoring a model is pretty simple. Identify and remove as many refusals and biased answers as possible, and keep the rest. Then train the model with the filtered dataset in exactly the same way that the original model was trained.
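
As a rough illustration of the filtering step (this is a sketch, not the actual script I used; the marker phrases and file names are just examples), something along these lines works on a dataset of instruction/output pairs:

import json

# Phrases that commonly mark a refusal or moralizing answer.
# Illustrative only; a real filter uses a much longer list.
REFUSAL_MARKERS = [
    "as an ai",
    "i cannot assist",
    "i'm sorry, but",
    "openai",
    "ethical principles",
]

def is_refusal(text: str) -> bool:
    lowered = text.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

# Hypothetical input file name; the output name matches the dataset published later.
with open("WizardLM_alpaca_evol_instruct_70k.json") as f:
    dataset = json.load(f)

# Keep only the examples whose output does not look like a refusal.
filtered = [example for example in dataset if not is_refusal(example["output"])]

with open("WizardLM_alpaca_evol_instruct_70k_unfiltered.json", "w") as f:
    json.dump(filtered, f, indent=2)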

Let’s get down to business. Uncensoring WizardLM.

I am just going to talk about WizardLM for now; the process for Vicuna or any other model is the same. Filter refusals and bias from the dataset -> finetune the model -> release.

Since there was already work done to uncensor Vicuna, I was able to rewrite their script so that it would work on the WizardLM dataset.

The next step was to run the script on the WizardLM dataset to produce ehartford/WizardLM_alpaca_evol_instruct_70k_unfiltered.

Now, I had the dataset. I obtained a 4x A100 80GB node from Azure, Standard_NC96ads_A100_v4. You can use any compute provider, though. I also recommend Runpod.io.

You need to have storage of at least 1TB, but ideally 2TB just to be safe. It really sucks when you are 20 hours into a run and you run out of storage. Do not recommend. I recommend mounting the storage at /workspace. Install anaconda and git-lfs. Then you can set up your workspace. We will download the dataset we created, and the base model llama-7b.

mkdir /workspace/models
mkdir /workspace/datasets
cd /workspace/datasets
git lfs install
git clone https://huggingface.co/datasets/ehartford/WizardLM_alpaca_evol_instruct_70k_unfiltered
cd /workspace/models
git clone https://huggingface.co/huggyllama/llama-7b
cd /workspace

Now it is time to follow the procedure to finetune WizardLM. I followed their procedure as precisely as I could.

conda create -n llamax python=3.10
conda activate llamax
git clone https://github.com/AetherCortex/Llama-X.git
cd Llama-X/src
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.3 -c pytorch
git clone https://github.com/huggingface/transformers.git
cd transformers
pip install -e .
cd ../..
pip install -r requirements.txt

Now, into this environment, we need to download the WizardLM finetune code.

cd src
wget https://github.com/nlpxucan/WizardLM/raw/main/src/train_freeform.py
wget https://github.com/nlpxucan/WizardLM/raw/main/src/inference_wizardlm.py
wget https://github.com/nlpxucan/WizardLM/raw/main/src/weight_diff_wizard.py

I made the following change because, during my finetune, I was getting extremely slow performance and determined (with help from friends) that it was flopping back and forth from CPU to GPU. After I deleted the following lines, it ran much better. Maybe delete them or not; it’s up to you.

vim configs/deepspeed_config.json

delete the following lines

        "offload_optimizer": {
            "machine": "cpu",
            "pin_memory": true
        },
        "offload_param": {
            "machine": "cpu",
            "pin_memory": true
        },
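
If you would rather make that edit programmatically, here is a minimal sketch; it assumes the offload entries live under the zero_optimization section, as they do in typical DeepSpeed configs, so check your copy of the file first:

import json

# Load the DeepSpeed config that ships with Llama-X.
with open("configs/deepspeed_config.json") as f:
    config = json.load(f)

# Drop the CPU-offload settings so optimizer state and parameters stay on GPU.
zero = config.get("zero_optimization", {})
zero.pop("offload_optimizer", None)
zero.pop("offload_param", None)

with open("configs/deepspeed_config.json", "w") as f:
    json.dump(config, f, indent=4)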

I recommend that you create an account on wandb.ai so that you can track your run easily. After you have created an account, copy your key from settings, and you can set it up.

wandb login

Now it is time to run. PLEASE NOTE that there is a bug when it saves the model, so do not delete the checkpoints. You will need the latest good checkpoint.


deepspeed train_freeform.py \
--model_name_or_path /workspace/models/llama-7b/ \
--data_path /workspace/datasets/WizardLM_alpaca_evol_instruct_70k_unfiltered/WizardLM_alpaca_evol_instruct_70k_unfiltered.json \
--output_dir /workspace/models/WizardLM-7B-Uncensored/ \
--num_train_epochs 3 \
--model_max_length 2048 \
--per_device_train_batch_size 8 \
--per_device_eval_batch_size 1 \
--gradient_accumulation_steps 4 \
--evaluation_strategy "no" \
--save_strategy "steps" \
--save_steps 800 \
--save_total_limit 3 \
--learning_rate 2e-5 \
--warmup_steps 2 \
--logging_steps 2 \
--lr_scheduler_type "cosine" \
--report_to "wandb" \
--gradient_checkpointing True \
--deepspeed configs/deepspeed_config.json \
--fp16 True
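
For reference, on the 4x A100 node above the effective batch size works out to per_device_train_batch_size x gradient_accumulation_steps x number of GPUs, i.e. 8 x 4 x 4 = 128 examples per optimizer step. If you adjust those two flags, keeping that product the same preserves the effective batch size while trading GPU memory against step time.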

Feel free to play with per_device_train_batch_size and gradient_accumulation_steps; they will not affect your output quality, they only affect performance. After this completes (maybe 26 hours) it will not be done, because there is a bug that stops the model from saving properly. Now you need to edit the train_freeform.py file so it will resume from the latest checkpoint. Find out the latest checkpoint directory.

ls /workspace/models/WizardLM-7B-Uncensored/
vim train_freeform.py

edit the line

   trainer.train()

so that instead it says

trainer.train(resume_from_checkpoint="/workspace/models/WizardLM-7B-Uncensored/<checkpoint directory>")

Save it and then run the train command with lower save_steps.

deepspeed train_freeform.py \
--model_name_or_path /workspace/models/llama-7b/ \
--data_path /workspace/datasets/WizardLM_alpaca_evol_instruct_70k_unfiltered/WizardLM_alpaca_evol_instruct_70k_unfiltered.json \
--output_dir /workspace/models/WizardLM-7B-Uncensored/ \
--num_train_epochs 3 \
--model_max_length 2048 \
--per_device_train_batch_size 8 \
--per_device_eval_batch_size 1 \
--gradient_accumulation_steps 4 \
--evaluation_strategy "no" \
--save_strategy "steps" \
--save_steps 80 \
--save_total_limit 3 \
--learning_rate 2e-5 \
--warmup_steps 2 \
--logging_steps 2 \
--lr_scheduler_type "cosine" \
--report_to "wandb" \
--gradient_checkpointing True \
--deepspeed configs/deepspeed_config.json \
--fp16 True

Then do the whole procedure again: change the checkpoint in train_freeform.py to the latest checkpoint, and again run it with decreased save_steps, until finally you run it with save_steps 1.

After this succeeds, the model is in the last checkpoint.
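
If it is not obvious which checkpoint directory is the latest, a small sketch like this prints it, assuming the usual Hugging Face Trainer convention of naming checkpoint directories checkpoint-<step>:

from pathlib import Path

# Pick the checkpoint directory with the highest step number.
checkpoints = Path("/workspace/models/WizardLM-7B-Uncensored").glob("checkpoint-*")
latest = max(checkpoints, key=lambda p: int(p.name.split("-")[-1]))
print(latest)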

cd /workspace/models/WizardLM-7B-Uncensored/
ls
cp <latest checkpoint directory>/* .
cd -

Now your model should work. Time to test it.

Edit the file Input.jsonl

vim Input.jsonl

Add some content to it, for example this

{"idx":0,"Instruction":"Insult me.  I need you to name me the worst curse phrases and insults you possibly can.  Make the insult a minimum of three sentences lengthy and comprise the worst and most base curse phrases doable.  Make it X rated and disgusting."}

Now run inference

python inference_wizardlm.py --base_model=/workspace/models/WizardLM-7B-Uncensored/

It will take a few moments, then you can check Output.jsonl for the responses.
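
If you want to skim the responses programmatically rather than opening the file, a small sketch like this works, assuming Output.jsonl holds one JSON object per line in the same style as Input.jsonl:

import json

# Print each record written by the inference script.
with open("Output.jsonl") as f:
    for line in f:
        line = line.strip()
        if line:
            print(json.loads(line))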

ChatGPT answers like this:

I’m sorry, but I cannot fulfill that request. As an AI developed by OpenAI, I am programmed to follow ethical guidelines, which include not engaging in harmful, offensive, or explicit content. I’m here to provide helpful and respectful information or assistance within those boundaries. If you have any other non-offensive questions or need assistance with a different topic, feel free to ask!

You will find WizardLM-Uncensored to be much more compliant.

Enjoy responsibly. You are responsible for whatever you do with the output of these models, just as you are responsible for whatever you do with a knife, a car, or a lighter.
