Text on Maps Help Guide
- Learn about new and experimental features that have been released to a subset of maps in the David Rumsey Map Collection.
- Discover what’s new, how to take advantage of the newly available tools, and how new computational methods enabled this new way to search maps.
- Find out how to get involved.
Quick Start
Search
In addition to a traditional search of Catalog Data (e.g. titles, authors, dates), the Luna Viewer now allows you to search the text content of many maps. We call this “Text on Maps”.
Only maps that have been georeferenced can be searched by the text they contain.
Using the Advanced Search, you can perform combined searches of Catalog Data and Text on Maps. In the Advanced Search, you can also specify which version of the Text on Maps you want to search. Read more about these options in the Advanced Search section.
View
Not only can you view the text you searched for, you can also view all other text on this map and correct the underlying data!
When you arrive from the search page, you will see only your search term highlighted. To show or hide all Text on Maps for this map, click on the icon in the top left corner of the map image.
Contribute
To edit a particular selection, click on the highlighted text. You will then see the annotation pop-up. To learn more about how to annotate (correct, confirm, etc.) text on maps, see the Become a Contributor section.
To edit an incorrect bounding polygon around text, click on the highlighted text. In the pop-up, click “Edit”. You will see the vertices of the polygon appear: you can move these to improve the way that the polygon surrounds the text, or you can delete a polygon that has been created in error. See the Bounding polygons section for more details.
Please note, you must be logged in to edit the data.
Search Text on Maps
If you want to try this new feature, type a word or phrase that interests you in the search field in the top left of your browser window. Then, from the drop-down menu, select “Text on Maps”.
The other options (“Catalog Data”, “Catalog Data & Text in documents”, and Advanced Search) are reviewed in more detail below.
Search Tips
- The initial prompt for searching “Text on Maps” accepts 1-word queries. If you want to search for multiple words, please see the section on this below.
- Searches are not case sensitive, and they do not accept regular expressions (regex).
- Although we are working on improving the performance of mapKurator for all languages, it is currently not possible to search Text on Maps for words in non-Latin alphabets.
- Your search results will be displayed in a random order that changes each time. So, if you run the same search more than once you will see the same results, but displayed in a different order.
- If you want to refine the results of your query with filters based on Catalog Data, you need to use the Advanced Search feature.
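If it helps to think about the tips above in concrete terms, here is a minimal Python sketch of the described behaviour. It is not the Luna Viewer's actual implementation: the function name, the flat list of labels, and the choice of whole-word matching are all assumptions made for illustration.

```python
import random


def text_on_maps_search(labels, query, rng=None):
    """Return the labels that match a one-word query, in random order.

    `labels` is a hypothetical flat list of transcribed words; the real
    index behind Text on Maps is not described in this guide.
    """
    word = query.strip()
    if len(word.split()) != 1:
        raise ValueError("the basic Text on Maps prompt takes a single word")
    # Case-insensitive, literal comparison: regex syntax is not interpreted.
    # Whole-word equality is an assumption of this sketch.
    needle = word.casefold()
    hits = [label for label in labels if label.casefold() == needle]
    # The same results come back in a different order on each run.
    (rng or random).shuffle(hits)
    return hits
```

In this sketch a query such as "P.ris" is compared character by character, so the dot is not treated as a wildcard.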
When you select “Text on Maps”, you will search for occurrences of the word across the entire dataset. For example, if you type “Paris” you will see how many times that word is printed on any of the ~57,000 maps in the collection that we have processed. (This includes text inside and outside of the neatline, e.g. it includes map titles and other descriptive information.) The searchable datasets represent content on David Rumsey collection maps that had been digitized and georeferenced up to 2022.
Note: Georeferencing means establishing points on the scanned image that correspond to locations on the earth. These control points allow content on the digitized map to be geolocated. You can try georeferencing yourself.
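To give a feel for how control points let map content be geolocated, here is a small Python sketch that fits a simple affine transform from exactly three control points. This is only an illustration under stated assumptions: the function names are hypothetical, the guide does not say which transformation the collection uses, and real georeferencing typically uses more control points and accounts for the map projection.

```python
def fit_affine(pixels, coords):
    """Fit value = a*x + b*y + c per output axis from three control points.

    `pixels` are (x, y) image positions and `coords` the matching
    (longitude, latitude) pairs. Solved with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = pixels
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if det == 0:
        raise ValueError("control points must not lie on one line")
    rows = []
    for axis in range(2):
        v1, v2, v3 = (c[axis] for c in coords)
        a = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        b = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        c = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        rows.append((a, b, c))
    return rows


def pixel_to_geo(transform, pixel):
    """Apply the fitted transform to one image position."""
    x, y = pixel
    return tuple(a * x + b * y + c for a, b, c in transform)
```

Once the transform is fitted, any text label's pixel position can be converted to an approximate longitude and latitude with `pixel_to_geo`.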
Search for Multiple Words
mapKurator output is stored only as individual words: there is no prediction of phrases, for example, for place names that contain more than one word (e.g. “South Ponte Vedra Beach”).
However, multi-word search is possible because of the way the data has been indexed. Briefly, multi-word searches will be successful when adjacent words are within a 2-character length of the two points of the bounding polygon that are furthest away from each other. This is based on the size of the characters in the least frequent word in the search, e.g. “Ponte” in the example below.
Example of multi-word search.
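The adjacency rule described above is stated only briefly, so the following Python sketch shows one possible reading of it rather than the actual indexing code. The function names, the way character width is estimated, and the use of polygon vertices for the distance check are all assumptions.

```python
from itertools import combinations
from math import dist


def farthest_pair(polygon):
    """Return the two vertices of a polygon that are furthest apart."""
    return max(combinations(polygon, 2), key=lambda pair: dist(*pair))


def is_adjacent(anchor_word, anchor_polygon, other_polygon, max_chars=2):
    """Check whether another word lies within `max_chars` character widths
    of either end of the anchor word's bounding polygon.

    The anchor is meant to be the least frequent word in the search.
    """
    end_a, end_b = farthest_pair(anchor_polygon)
    # Assumption: one character width is the polygon's longest extent
    # divided by the number of characters in the anchor word.
    char_width = dist(end_a, end_b) / len(anchor_word)
    reach = max_chars * char_width
    return any(
        dist(end, vertex) <= reach
        for end in (end_a, end_b)
        for vertex in other_polygon
    )
```

With “Ponte” as the anchor, a “Vedra” label printed just to its right would count as adjacent, while the same word printed far across the sheet would not.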
Advanced Search
Advanced searching allows you to combine queries of Text on Maps with Catalog Data, i.e. you can leverage the power of the search by text, but also filter the results by their metadata. You can access the advanced search options by clicking on the drop-down menu in the search field.
By clicking on the “Advanced Search” option, you will access a dedicated interface where you can refine your queries in the collection, using the different fields. You can choose multiple criteria for your search.
- Find all these words: Click on the drop-down menu of the first white box and select one of the data fields from the David Rumsey Map Collection catalog, for example “country”. In the corresponding box on the right, you can type the value that you want to match, for example “United States”. As a result, you will get all maps that have both the words “United” and “States” as a value for “country”.
- Find any of these words: Click on the drop-down menu of the first white box and select one of the data fields from the David Rumsey Map Collection catalog, for example “country”. In the corresponding box on the right, you can type the value that you want to match, for example “United States”. As a result, you will get all maps that have the word “United” OR the word “States” as a value for “country” (this will include, for example, maps featuring the United Republic of Congo).
- Find this exact wording: Click on the drop-down menu of the first white box and select one of the data fields from the David Rumsey Map Collection catalog, for example “country”. In the corresponding box on the right, you can type the value that you want to match, for example “United States of America”. As a result, you will get ONLY the maps that exactly match your query and have “United States of America” as a value (this will NOT include maps that have only “United States” as a value).
- Date range: Click on the drop-down menu to select the data field you want to search. The options are either “date” (the date the map was published) or “pub date” (the publication date of the item in which a map appears, for example an atlas). In the corresponding fields on the right, enter two dates that represent the beginning and end of your date range, for example “1830” and “1850”. As a result, you will see all the maps that have a date (or pub date, depending on your query) that falls in that range. You can combine this filter with the previous one.
- Words in text on maps: Click on the drop-down menu to choose what type of text on maps you want to search. The options are:
  - “MapKurator output”: This is the raw output of the machine learning pipeline (mapKurator) applied to the maps. In other words, it is the corpus of mapKurator’s predicted transcriptions of text within its predicted bounding polygons. No subsequent edits or processes have been applied.
  - “MapKurator output (post-processed)”: In this dataset, the output from mapKurator has undergone another processing step. Based on the predicted transcription and the geographic coordinates associated with it, another machine learning module attempts to match the predicted text against a vocabulary of feature names in OpenStreetMap. Content in this field is always CAPITALIZED. This process may help reduce errors, but can also introduce new ones, and it can make some lesser-known features less visible.
  - “User annotations”: Updates to mapKurator predictions created by users like you, through edits and/or validation of the automated transcriptions generated by mapKurator. Currently, this dataset is very small but it will grow with time. The index of annotations is updated daily, so new annotations may not be immediately searchable.
  - “All Text on Maps”: includes all of the above.
  Then, in the field on the right, write the word you want to search in any of the selected options for Text on Maps.
Please note: In the general, non-advanced search, the “Text on Maps” option searches the “MapKurator output (post-processed)” data by default.
Please also note: once you refine the search in this way, it is not possible to further organize the results. To complete searches with more complex sorting and filtering, please use the “Catalog Data” search without also including the “Text on Maps” option.
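The difference between the three word-matching options and the date range filter described above can be sketched in a few lines of Python. These helper functions are hypothetical and only model the behaviour as the guide describes it; case-insensitive comparison, whole-value matching for exact wording, and inclusive date bounds are assumptions.

```python
def words(value):
    """Split a catalog value into lower-cased words."""
    return value.casefold().split()


def find_all_words(field_value, query):
    """True when every query word appears in the field value."""
    have = set(words(field_value))
    return all(word in have for word in words(query))


def find_any_words(field_value, query):
    """True when at least one query word appears in the field value."""
    have = set(words(field_value))
    return any(word in have for word in words(query))


def find_exact_wording(field_value, query):
    """True when the field value is exactly the query wording."""
    return words(field_value) == words(query)


def in_date_range(year, start, end):
    """True when the year falls between start and end (inclusive here)."""
    return start <= year <= end
```

For example, `find_any_words("United Republic of Congo", "United States")` is true because the word “United” matches, while `find_all_words` with the same arguments is false.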
Here are some examples:
- You can refine the search to give results only for maps that were published between 1700 and 1800 and where the sheet contains the raw mapKurator output for “Paris”. In the search box in the browser this is expressed as: pub_date=1700…1800 AND ocrText="Paris" LIMIT:RUMSEY~8~1. Here, pub_date represents the date range and ocrText represents the raw mapKurator output.
- A variation on this search limits by publication date, but searches the post-processed mapKurator transcriptions instead of the raw mapKurator output: pub_date=1500…1700 AND postOcrText="France" LIMIT:RUMSEY~8~1
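If you build many queries of this shape, a small helper can assemble the string for you. The Python sketch below simply mirrors the two examples above: the function name is hypothetical, the range separator is copied exactly as it appears in this guide (adjust it if your search box expects something different), and the use of straight double quotes around the word is an assumption.

```python
# Field names taken from the two example queries in this guide.
TEXT_FIELDS = {"ocrText", "postOcrText"}


def build_query(start_year, end_year, field, word,
                collection="RUMSEY~8~1", range_sep="…"):
    """Assemble a search-box query shaped like the examples in this guide."""
    if field not in TEXT_FIELDS:
        raise ValueError(f"unknown Text on Maps field: {field}")
    return (
        f"pub_date={start_year}{range_sep}{end_year} "
        f'AND {field}="{word}" LIMIT:{collection}'
    )
```

Calling `build_query(1500, 1700, "postOcrText", "France")` reproduces the second example query.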
Happy exploring!
View Search Results
Masonry View
By default, you will view the results of your Text on Maps search in “masonry view”, i.e., a collage of all the map text that matches your search.
If you hover the pointer over any result’s “brick”, you will see a small preview of the entire map and a yellow pin signaling the position of that particular annotation in relation to the map.
The masonry view is a quick way to inspect the variety of ways in which a word or words appear on maps from many cultures and centuries.
It’s also a handy way to visualize errors in the automated text detection and recognition parts of the method that created this data.
Tile View
You can also select the “tile view” to directly see all thumbnails of the maps that match your search. This view is more like the traditional view you see when searching via Catalog Data (e.g. the title).
In both the masonry and tile views, if you click on any of the map thumbnails, you will be taken to a larger view of the map. Here you can also see all the machine-generated bounding boxes around the labels and their transcriptions by clicking the lines icon in the toolbar (see Become a Contributor).
Become a Contributor
When you look at the machine-generated annotations (“MapKurator output” or “MapKurator output (post-processed)”), you may find errors.
Maps, and historical ones in particular, are a very challenging input data source for text detection and recognition (or “text spotting”, as used by mapKurator), and the quality of the results will vary depending on the color(s) of the background, the fonts, the printing technique, the language, the conservation status, and so on.
We invite all the users of the David Rumsey Map Collection to team up with the machine and improve or confirm the annotations.
If you spot a mistake, please consider contributing a better transcription and/or a more accurate bounding box.
Text transcription
You can fix a text label’s transcription so that it matches what appears on the sheet.
Please note: The goal is to accurately reflect what is on the map.
This includes what might be perceived as “errors” given changes to place names or other factors. Here are some guidelines to help you:
- If there is a typo on the map, or a place name has changed (based on your knowledge), or you want to write the place name in a different language (transliteration), do not make changes that do not reflect what is on the map.
- Similarly, do not expand abbreviations.
- Please include punctuation or spaces as relevant.
If you are interested in learning more about best practices in annotating text on maps, see the annotation guidelines developed by Machines Reading Maps.
Editing is very easy!
- Click on the annotation you want to improve. You can view previous transcriptions by unfolding the “n more transcriptions” part of the box.
- Click on “Edit” to add a transcription that will be saved under your user name.
- The bounding polygon and the text field will now be editable and you can change the text transcription as needed.
- You can cancel your changes or save them by clicking on the relevant buttons.
- Afterwards, your transcription will be immediately visible (it will be searchable after 1 day).
Bounding polygons
When you select an annotation, several points along the polygon become active, and you can move them around to change the shape of the polygon. However, you are not currently able to draw new polygons around text.
If the polygon incorrectly surrounds a word, you can modify the polygon around that text.
Bounding polygons may be incorrect when:
- They do not include all the characters of one word
- Two polygons overlap/duplicate the transcription of a single word
- They fail to capture a text label at all
Visible examples of those circumstances:
What happens to my annotations?
Annotations are saved and will appear to the public online immediately; however, changes will be searchable only after a delay of 1 day. These changes are logged alongside the existing data: nothing is removed from the underlying data.
Your changes will be identified by your user name.
Once the text and the bounding polygon have been confirmed by 1 user, a green check will appear next to the transcription. This acts as a guide to future users so that they can focus elsewhere.
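The lifecycle described above (nothing is overwritten, edits are visible at once, searchable after about a day, and a single confirmation earns a green check) can be modelled with a small append-only record. This Python sketch is purely illustrative: the class, its method names, and the fixed one-day delay are assumptions, not a description of how the collection stores annotations.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Assumption: the daily index update is modelled as a fixed one-day delay.
SEARCH_DELAY = timedelta(days=1)


@dataclass
class Label:
    machine_text: str                                  # original prediction, never removed
    history: list = field(default_factory=list)        # (user, text, timestamp) entries
    confirmations: set = field(default_factory=set)    # user names that confirmed the label

    def annotate(self, user, text, when):
        """Log a new transcription alongside the existing data."""
        self.history.append((user, text, when))

    def confirm(self, user):
        """Record that a user confirmed the text and bounding polygon."""
        self.confirmations.add(user)

    def visible_text(self):
        """The latest transcription is shown immediately."""
        return self.history[-1][1] if self.history else self.machine_text

    def searchable_texts(self, now):
        """Machine text plus annotations old enough to be indexed."""
        indexed = [t for _, t, when in self.history if now - when >= SEARCH_DELAY]
        return [self.machine_text, *indexed]

    def is_confirmed(self):
        """One confirming user is enough for the green check."""
        return len(self.confirmations) >= 1
```

Because `annotate` only appends to the history, the machine prediction and every earlier transcription remain available, which matches the guide's statement that nothing is removed from the underlying data.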