Obsidian-Copilot: A Prototype Assistant for Writing & Thinking

2023-06-13 09:27:35

What would a copilot for writing and thinking look like? To try answering this question, I built a prototype: Obsidian-Copilot. Given a section header, it helps draft a few paragraphs via retrieval-augmented generation. Also, if you write a daily journal, it can help you reflect on the past week and plan for the week ahead.

Obsidian Copilot: Helping to write drafts and reflect on the week


Here’s a short 2-minute demo. The code is available at obsidian-copilot.

How does it work?

We start by parsing documents into chunks. A sensible default is to chunk documents by token length, usually 1,500 to 3,000 tokens per chunk. However, I found that this didn’t work very well. A better approach might be to chunk by paragraphs (e.g., split on \n\n).

Given that my notes are mostly in bullet form, I chunk by top-level bullets: each chunk is made up of a single top-level bullet and its sub-bullets. There are usually five to ten sub-bullets per top-level bullet, making each chunk similar in length to a paragraph.

# Assumes `lines` holds the lines of a markdown note and `min_chunk_lines`
# is the minimum number of lines for a chunk to be kept.
from collections import defaultdict

chunks = defaultdict()
current_chunk = []
chunk_idx = 0
current_header = None

for line in lines:

    if '##' in line:  # Chunk header = Section header
        current_header = line

    if line.startswith('- '):  # Top-level bullet
        if current_chunk:  # If lines have accumulated, add them to chunks
            if len(current_chunk) >= min_chunk_lines:
                chunks[chunk_idx] = current_chunk
                chunk_idx += 1
            current_chunk = []  # Reset current chunk
            if current_header:
                current_chunk.append(current_header)

    current_chunk.append(line)

Next, we build an OpenSearch index and a semantic index on these chunks. In a previous experiment, I found that embedding-based retrieval alone can be insufficient, so I added classical search (i.e., BM25 via OpenSearch) to this prototype.

For OpenSearch, we start by configuring filters and fields. We include filters such as stripping HTML, removing possessives (i.e., the trailing ’s in words), removing stopwords, and basic stemming. These filters are applied to both documents (during indexing) and queries. We also specify the fields we want to index and their respective types. Types matter because filters are applied to text fields (e.g., title, chunk) but not to keyword fields (e.g., path, doc type). We don’t apply preprocessing to file paths so they stay as they are.

'mappings': {
    'properties': {
        'title': {'type': 'text', 'analyzer': 'english_custom'},
        'type': {'type': 'keyword'},
        'path': {'type': 'keyword'},
        'chunk_header': {'type': 'text', 'analyzer': 'english_custom'},
        'chunk': {'type': 'text', 'analyzer': 'english_custom'},
    }
}
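
The english_custom analyzer itself isn’t shown above. As a rough sketch, it could be assembled from standard OpenSearch building blocks like the following; the exact filter chain in the repo may differ.

index_settings = {
    'analysis': {
        'analyzer': {
            'english_custom': {
                'type': 'custom',
                'tokenizer': 'standard',
                'char_filter': ['html_strip'],     # strip HTML tags
                'filter': [
                    'english_possessive_stemmer',  # drop the trailing 's
                    'lowercase',
                    'english_stop',                # remove stopwords
                    'english_stemmer',             # basic stemming
                ],
            },
        },
        'filter': {
            'english_stop': {'type': 'stop', 'stopwords': '_english_'},
            'english_stemmer': {'type': 'stemmer', 'language': 'english'},
            'english_possessive_stemmer': {'type': 'stemmer', 'language': 'possessive_english'},
        },
    },
}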

When querying, we apply boosts to make some fields count more towards the relevance score. In this prototype, I arbitrarily boosted titles by 5x and chunk headers (i.e., top-level bullets) by 2x. Retrieval can likely be improved by tweaking these boosts as well as other features.
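
As an illustration, a boosted query in OpenSearch’s query DSL might look like the sketch below; the client variable, index name, and example query string are placeholders rather than the repo’s actual code, and the ^ syntax is how OpenSearch expresses per-field boosts.

# Hypothetical example: boost title 5x and chunk_header 2x relative to chunk
bm25_query = {
    'query': {
        'multi_match': {
            'query': 'retrieval-augmented generation',  # placeholder query
            'fields': ['title^5', 'chunk_header^2', 'chunk'],
        }
    },
    'size': 10,
}
# response = opensearch_client.search(index='obsidian_docs', body=bm25_query)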

For semantic search, we start by picking an embedding model. I referred to the Massive Text Embedding Benchmark (MTEB) Leaderboard, sorted it in descending order of retrieval score, and picked a model with a good balance of embedding dimension and score.

This led me to e5-small-v2. At the time of writing, it’s ranked a decent seventh, right below text-embedding-ada-002. What’s impressive is its embedding dimension of 384, which is far smaller than that of most models (768 – 1,536). And while it supports a maximum sequence length of only 512, that is sufficient given my shorter chunks. (More details in the paper Text Embeddings by Weakly-Supervised Contrastive Pre-training.) After embedding these documents, we store them in a numpy array.
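
As a minimal sketch of that step, the chunks could be embedded with e5-small-v2 roughly as follows; the average_pool helper follows the e5 model card, while the batching and variable names such as chunk_texts are assumptions rather than the repo’s actual code.

import numpy as np
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('intfloat/e5-small-v2')
model = AutoModel.from_pretrained('intfloat/e5-small-v2')

def average_pool(last_hidden_states, attention_mask):
    # Mean-pool token embeddings, ignoring padded positions
    last_hidden = last_hidden_states.masked_fill(~attention_mask[..., None].bool(), 0.0)
    return last_hidden.sum(dim=1) / attention_mask.sum(dim=1)[..., None]

# `chunk_texts` is a hypothetical list of chunk strings from the step above;
# note the required "passage: " prefix for documents
doc_inputs = tokenizer([f'passage: {text}' for text in chunk_texts],
                       max_length=512, padding=True, truncation=True, return_tensors='pt')
with torch.no_grad():
    doc_outputs = model(**doc_inputs)
doc_embeddings = average_pool(doc_outputs.last_hidden_state, doc_inputs['attention_mask'])
doc_embeddings_array = F.normalize(doc_embeddings, p=2, dim=1).numpy()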

At query time, we tokenize and embed the query, take the dot product with the document embedding array, and take the top n results (in this case, 10).

def query_semantic(query, tokenizer, model, doc_embeddings_array, n_results=10):
    # Embed the query (note the required "query: " prefix for e5 models)
    query_tokenized = tokenizer(f'query: {query}', max_length=512, padding=False, truncation=True, return_tensors='pt')
    outputs = model(**query_tokenized)
    query_embedding = average_pool(outputs.last_hidden_state, query_tokenized['attention_mask'])
    query_embedding = F.normalize(query_embedding, p=2, dim=1).detach().numpy()

    # Since embeddings are L2-normalized, the dot product equals cosine similarity
    cos_sims = np.dot(doc_embeddings_array, query_embedding.T)
    cos_sims = cos_sims.flatten()

    # Indices of the top n most similar chunks
    top_indices = np.argsort(cos_sims)[-n_results:][::-1]

    return top_indices

If you’re thinking of using the e5 models, remember to add the required prefixes during preprocessing. For documents, you’ll have to prefix them with “passage: ”, and for queries, you’ll have to prefix them with “query: ”.

The retrieval service is a FastAPI app. Given an input query, it performs both BM25 and semantic search, deduplicates the results, and returns the documents’ text and associated titles. The latter is used to link source documents when generating the draft.
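
A rough sketch of that flow is below; the route, helper function names, and response shape are assumptions for illustration, not the actual API of the app.

from fastapi import FastAPI

app = FastAPI()

@app.get('/query')
def retrieve(query: str, n_results: int = 10):
    # Both hypothetical helpers return lists of {'title': ..., 'chunk': ...} dicts
    bm25_hits = query_opensearch(query, n_results=n_results)
    semantic_hits = query_semantic_chunks(query, n_results=n_results)

    # Deduplicate by chunk text while preserving order
    seen, results = set(), []
    for hit in bm25_hits + semantic_hits:
        if hit['chunk'] not in seen:
            seen.add(hit['chunk'])
            results.append({'title': hit['title'], 'chunk': hit['chunk']})
    return {'query': query, 'results': results}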

To start the OpenSearch node and the semantic search + FastAPI server, we use a simple docker-compose file. They each run in their own containers, bridged by a common network. For convenience, we also define common commands in a Makefile.

Finally, we integrate with Obsidian via a TypeScript plugin. The obsidian-plugin-sample made it easy to get started, and I added features to display retrieved documents in a new tab, query the APIs, and stream the output. (I’m new to TypeScript, so feedback is appreciated!)

What else can we apply this to?

While this prototype uses local notes and journal entries, it’s not a stretch to imagine the copilot retrieving from other documents (online). For example, team documents such as product requirements and technical design docs, internal wikis, and even code. I’d guess that’s what Microsoft, Atlassian, and Notion are working on right now.

It also extends beyond personal productivity. Within my field of recommendations and search, researchers and practitioners are excited about layering LLM-based generation on top of existing systems and products to improve the customer experience. (I expect we’ll see some of this in production by the end of the year.)

Ideas for improvement

One idea is to try LLMs with larger context sizes that let us feed in entire documents instead of chunks. This may help with retrieval recall but puts more onus on the LLM to identify the relevant text. Currently, I’m using gpt-3.5-turbo, which is a good balance of speed and cost, but I’m excited to try claude-1.3-100k with full documents as context.

Another idea is to augment retrieval with web or internal search when necessary. For example, when documents and notes go stale (e.g., based on the last-updated timestamp), we can search the web or internal documents for newer information.
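
For instance, a toy staleness check based on a note’s file modification time could look like this; the 30-day threshold is an arbitrary assumption.

import os
import time

def is_stale(path: str, max_age_days: int = 30) -> bool:
    # Flag notes that haven't been updated in more than `max_age_days` days
    age_days = (time.time() - os.path.getmtime(path)) / 86400
    return age_days > max_age_days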

• • •

Here’s the GitHub repo if you’re keen to give it a try. Start by cloning the repo and updating the paths to your obsidian-vault and huggingface hub cache. The latter saves us from downloading the tokenizer and model each time you start the containers.

git clone https://github.com/eugeneyan/obsidian-copilot.git

# Open Makefile and update the following paths
export OBSIDIAN_PATH = /Users/eugene/obsidian-vault/
export TRANSFORMER_CACHE = /Users/eugene/.cache/huggingface/hub

Then, build the image and indices before starting the retrieval app.

# Build the docker image
make build

# Start the opensearch container and wait for it to start.
# You should see something like this: [c6587bf83572] Node 'c6587bf83572' initialized
make opensearch

# In ANOTHER terminal, build your artifacts (this may take a while)
make build-artifacts

# Start the app. You should see this: Uvicorn running on http://0.0.0.0:8000
make run

Finally, install the copilot plugin, enable it in community plugin settings, and update the API key. You’ll have to restart your Obsidian app if you had it open before installation.

If you’ve tried it, I’d love to hear how it went, especially where it didn’t work well and how it can be improved. Or if you’ve been working with retrieval-augmented generation, I’d love to hear about your experience so far!

To cite this content, please use:

Yan, Ziyou. (Jun 2023). Obsidian-Copilot: A Prototype Assistant for Writing & Thinking. eugeneyan.com.
https://eugeneyan.com/writing/obsidian-copilot/.

@article{yan2023copilot,
  title   = {Obsidian-Copilot: A Prototype Assistant for Writing & Thinking},
  author  = {Yan, Ziyou},
  journal = {eugeneyan.com},
  year    = {2023},
  month   = {Jun},
  url     = {https://eugeneyan.com/writing/obsidian-copilot/}
}
