Word-As-Image for Semantic Typography

Our word-as-image illustrations concentrate on changing only the geometry of the letters to
convey the meaning.
We deliberately do not change color or texture and do not use embellishments.
This allows simple, concise, black-and-white designs that convey the semantics clearly.
We rely on the prior of a pretrained Stable Diffusion model to connect text and images, and utilize
the Score Distillation Sampling approach to encourage the appearance of the letter to reflect the
provided textual concept.
Given an input word, our method is applied separately to each letter.
We represent each letter as a closed vectorized shape.
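
One way to picture a closed vectorized shape is as a loop of curve segments whose last segment wraps back to the first point. The sketch below assumes cubic Bezier segments stored as anchor, handle, handle triples; the text above does not specify the curve type, and the names `cubic_bezier` and `sample_closed_path` are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cubic_bezier(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Evaluate one cubic Bezier segment at parameter t in [0, 1]."""
    s = 1.0 - t
    x = s**3 * p0[0] + 3 * s**2 * t * p1[0] + 3 * s * t**2 * p2[0] + t**3 * p3[0]
    y = s**3 * p0[1] + 3 * s**2 * t * p1[1] + 3 * s * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)


def sample_closed_path(points: List[Point], samples_per_segment: int = 8) -> List[Point]:
    """Sample a closed outline stored as [anchor, handle, handle, anchor, ...].

    Assumed layout: every group of three points starts a segment, and the
    final segment ends at the first anchor, which is what closes the shape.
    """
    if not points or len(points) % 3 != 0:
        raise ValueError("expected three control points per segment")
    n = len(points)
    outline: List[Point] = []
    for start in range(0, n, 3):
        p0, p1, p2 = points[start], points[start + 1], points[start + 2]
        p3 = points[(start + 3) % n]  # wraps around for the last segment
        for k in range(samples_per_segment):
            outline.append(cubic_bezier(p0, p1, p2, p3, k / samples_per_segment))
    return outline


# A unit square written as four straight cubic segments (handles on the edges).
square = [
    (0.0, 0.0), (1 / 3, 0.0), (2 / 3, 0.0),
    (1.0, 0.0), (1.0, 1 / 3), (1.0, 2 / 3),
    (1.0, 1.0), (2 / 3, 1.0), (1 / 3, 1.0),
    (0.0, 1.0), (0.0, 2 / 3), (0.0, 1 / 3),
]
outline = sample_closed_path(square)
```

In this layout the control points are the only free parameters of the letter, so moving them deforms the outline while the shape stays closed by construction.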

Given an input letter represented by a set of control points P, and a concept (shown in purple),
our goal is to optimize its parameters to reflect the meaning of the word, while still preserving its original style and design.
We optimize the new positions P̂ of the deformed letter iteratively. At each iteration, we use a differentiable rasterizer (DiffVG, marked in blue) that allows gradients to be backpropagated from a raster-based loss to
the shape's parameters.
We then augment the rasterized deformed letter and pass it into a pretrained, frozen Stable
Diffusion model, which drives the letter shape to convey the semantic concept using the Lsds loss (1).
To preserve the shape of the original letter and ensure legibility
of the word, we utilize two additional loss functions. The first loss
preserves the local tone and structure of the
letter by comparing a low-pass filtered version (LPF, marked in yellow) of the resulting rasterized
letter with that of the original one to compute Ltone (2).
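
The tone comparison can be sketched in a few lines. This is a minimal illustration, assuming the low-pass filter is a separable Gaussian blur and that the two blurred images are compared with a mean squared difference; neither the filter type, its parameters (`sigma`, `radius`), nor the normalization is stated in the text above, and the function names are hypothetical.

```python
import math
from typing import List

Image = List[List[float]]  # grayscale raster, rows of pixel intensities


def gaussian_kernel(sigma: float, radius: int) -> List[float]:
    """Normalized 1D Gaussian weights of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(img: Image, kernel: List[float]) -> Image:
    """Convolve every row with the kernel, clamping indices at the border."""
    radius = len(kernel) // 2
    width = len(img[0])
    return [
        [
            sum(kernel[k] * row[min(max(x + k - radius, 0), width - 1)]
                for k in range(len(kernel)))
            for x in range(width)
        ]
        for row in img
    ]


def transpose(img: Image) -> Image:
    return [list(col) for col in zip(*img)]


def low_pass(img: Image, sigma: float = 2.0, radius: int = 4) -> Image:
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma, radius)
    horizontal = blur_rows(img, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))


def tone_loss(original: Image, deformed: Image, sigma: float = 2.0, radius: int = 4) -> float:
    """Mean squared difference between the two low-pass filtered rasters."""
    a = low_pass(original, sigma, radius)
    b = low_pass(deformed, sigma, radius)
    diffs = [(pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return sum(diffs) / len(diffs)
```

Because the blur removes fine detail before the comparison, small changes along the outline cost little, while changes that move large amounts of ink are penalized more heavily.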
The second loss regulates the shape modification by constraining the deformation
to be as-conformal-as-possible over a triangulation of the letter's
shape (D, marked in green), defining Lacap (3).
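
A conformal map preserves angles, so one plausible way to measure how far a deformation is from conformal is to compare the corner angles of each triangle before and after the deformation. The sketch below follows that reading; the exact formula is not given in the text above, so the mean squared angle difference, the fixed triangle connectivity passed in as `triangles`, and the function names are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Triangle = Tuple[int, int, int]  # indices into the point list


def corner_angles(points: Sequence[Point], tri: Triangle) -> List[float]:
    """Return the three interior angles (in radians) of one triangle."""
    angles = []
    for k in range(3):
        ax, ay = points[tri[k]]
        bx, by = points[tri[(k + 1) % 3]]
        cx, cy = points[tri[(k + 2) % 3]]
        ux, uy = bx - ax, by - ay
        vx, vy = cx - ax, cy - ay
        # atan2(|cross|, dot) avoids the precision loss of acos near 0 and pi.
        angles.append(math.atan2(abs(ux * vy - uy * vx), ux * vx + uy * vy))
    return angles


def acap_loss(original: Sequence[Point], deformed: Sequence[Point],
              triangles: Sequence[Triangle]) -> float:
    """Mean squared change of triangle corner angles between two point sets.

    The triangulation is computed once on the original shape and reused for
    the deformed points, so both angle sets refer to the same triangles.
    """
    if not triangles:
        raise ValueError("at least one triangle is required")
    total = 0.0
    count = 0
    for tri in triangles:
        for a, b in zip(corner_angles(original, tri), corner_angles(deformed, tri)):
            total += (a - b) ** 2
            count += 1
    return total / count
```

Under this measure, translating, rotating, or uniformly scaling the letter costs nothing, while shearing or locally stretching it is penalized, which matches the intent of keeping the letter recognizable.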