Make title sorting ignore diacriticals #8736
Merged
Closes #8722
Title sorting now ignores diacriticals.
Technical
This solution replaces the lowercase filter on the text_title_sort SOLR field type with the icuFolding filter, which not only lowercases tokens but also removes their diacritical marks according to Unicode folding rules.
A SOLR reindex will be required for this change to take effect.
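To illustrate the effect on ordering, here is a minimal Python sketch. It is not the ICU folding algorithm that the filter uses; it only approximates it with the standard library (NFKD decomposition, dropping combining marks, then case folding). The helper name `fold_for_sort` and the sample list are assumptions made up for this example.

```python
import unicodedata


def fold_for_sort(title: str) -> str:
    """Hypothetical helper: approximate a folded sort key.

    Decomposes characters (NFKD), drops combining marks such as accents,
    and case-folds the result. Real ICU folding covers more cases.
    """
    decomposed = unicodedata.normalize("NFKD", title)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# Sample titles taken from the test cases mentioned in this PR.
titles = ["Wít", "With Mark Twain", "Wit", "Déchets", "Dechets"]

# Lowercase-only keys: accented letters sort after all unaccented ASCII letters.
lowercase_order = sorted(titles, key=str.lower)

# Folded keys: accented and unaccented spellings get the same key.
folded_order = sorted(titles, key=fold_for_sort)

print(lowercase_order)
print(folded_order)
```

With lowercase-only keys, "With Mark Twain" lands between "Wit" and "Wít"; with folded keys, "Wit" and "Wít" compare equal and sort next to each other, ahead of "With Mark Twain".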
Testing
The test case in the issue, https://openlibrary.org/authors/OL291650A/Max_Ernst?q=&sort=title, should now collate Dechets and Déchets together.
Screenshot
I added some works to my development instance to test this behavior.
Before this change, the "With Mark Twain" work sorted between "Wit" and "Wít".
Stakeholders
@mheiman @tfmorris @cdrini