Allow me to ask you question first. Do you know there’s length limit for every column in indexes? If you know the answer, you can easily understand the background of the whole issue.
According to MySQL documentation, the length limit is 767 bytes for every string-typed column in indexes. In terms of characters, the length limit is 255 or 191 for UTF8 or UTF8MB4 columns respectively. For further information about the limit, you may refer to this post:
How to Resolve ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
That is to say, if you create a unique index for a UTF8 column with maximum length limit 255 characters. The index will only inspect the first 255 characters of input strings for potential unique violations before insert, and they don’t care the following characters after the first 255.
In such case, as long as the first 255 characters are duplicated, the row is duplicated, no matter what is followed after the first 255. The non-duplicated rows are duplicated from the perspective of unique constraints. Sounds ridiculous? I have a solution for you.
If you still need unique constraints on this column. One solution that I can think about to conquer the problem so far is to split the column into multiple columns with 255 characters, then create a unique constraint to combine these columns.
For example, a former column called group_name and typed as varchar(1020) can be split into 3 columns:
- subgroup1 varchar(255)
- subgroup2 varchar(255)
- subgroup3 varchar(255)
mysql> alter table <table_name> add unique index <index_name> (subgroup1, subgroup2, subgroup3);