As part of my move from Sphinx to Postgres Full-Text Search, I needed a way to normalize accented characters. My data contains lots of diacritics; a common example is the varietal name “Grüner Veltliner”.

My users do not want to type that umlaut every time they search for this varietal.

Fortunately, there is an awesome Postgres contrib module called “unaccent” that replaces accented characters with their plain-text equivalents, effectively normalizing the data set.

Using the package is pretty straightforward. We first install the extension, then create a new text search configuration that uses it, and finally make sure we use that configuration for all indexing and searching.
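Here is a minimal sketch of what that setup can look like. The configuration name `en_unaccent`, the table `wines`, and its `name` column are hypothetical names chosen for illustration. The sketch also assumes the contrib modules are installed on the server and that the existing configuration is the stock `english` one.

```sql
-- Install the extension (needs the contrib package on the server).
CREATE EXTENSION IF NOT EXISTS unaccent;

-- Hypothetical name: a copy of the built-in english configuration.
CREATE TEXT SEARCH CONFIGURATION en_unaccent ( COPY = english );

-- Run word tokens through unaccent first, then the English stemmer.
ALTER TEXT SEARCH CONFIGURATION en_unaccent
  ALTER MAPPING FOR hword, hword_part, word
  WITH unaccent, english_stem;

-- Index with the new configuration (hypothetical table and column).
CREATE INDEX wines_name_fts_idx
  ON wines
  USING gin (to_tsvector('en_unaccent', name));
```

Because unaccent is a filtering dictionary, it passes its output on to `english_stem` rather than ending the chain, so stemming still applies to the unaccented words.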

When searching, make sure to reference the new configuration instead of the default ‘english’, so that the query text is normalized the same way as the indexed text.
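As a rough illustration, assume a configuration named `en_unaccent` (a hypothetical name) that already maps words through unaccent, and a hypothetical `wines` table with a `name` column. A search could then look like this:

```sql
-- Quick check: both sides use the same configuration, so the
-- accented document and the plain query should normalize alike.
SELECT to_tsvector('en_unaccent', 'Grüner Veltliner')
       @@ plainto_tsquery('en_unaccent', 'gruner veltliner') AS matches;

-- Searching a table (hypothetical table and column names).
SELECT name
FROM wines
WHERE to_tsvector('en_unaccent', name)
      @@ plainto_tsquery('en_unaccent', 'gruner');
```

Passing the configuration explicitly to both `to_tsvector` and `plainto_tsquery` avoids depending on the session's `default_text_search_config` setting. It also lets the planner use an expression index built with the same configuration.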