Java Sort Alphanumeric Strings Example
Sorting alphanumeric strings is a common requirement in applications where data contains both letters and numbers. Java provides built-in sorting mechanisms that can be used to sort such strings lexicographically. However, in some cases, natural sorting (which correctly orders numbers within strings) is required. Let us look at how Java sorts alphanumeric strings lexicographically, in natural order, and case-insensitively.
1. Understanding the Problem
Given a list of alphanumeric strings, we need to sort them in different ways:
- Lexicographically (default string sorting)
- Natural order (numeric portions sorted as numbers)
- Case-insensitive sorting
1.1 Comparison
Sorting Method | Advantages | When to Use | Disadvantages |
---|---|---|---|
Lexicographically (default string sorting) | Simple, default behavior in Java; follows Unicode order. | Use when sorting purely alphabetical strings or when Unicode order is required. | Does not handle numbers naturally; “apple10” comes before “apple2” due to character-by-character comparison. |
Natural Order (numeric portions sorted as numbers) | Sorts numbers correctly within strings; “apple2” comes before “apple10”. | Use when dealing with alphanumeric data where numbers should be treated as numeric values. | Requires additional logic to extract and compare numbers, which adds complexity. |
Case-Insensitive Sorting | Ignores case differences, ensuring “Apple” and “apple” are treated the same. | Use when sorting user-generated text or case-insensitive lists. | May not be suitable if case-sensitive differentiation is required. |
2. Lexicographic Sorting of Alphanumeric Strings
Lexicographic sorting compares strings character by character using their Unicode (UTF-16 code unit) values, meaning digits come before uppercase letters and uppercase letters come before lowercase letters. The `String::compareTo` method implements this ordering. Because the comparison is decided at the first differing character, ‘apple10’ comes before ‘apple2’ (‘1’ is less than ‘2’), so this method does not yield a true numerical order for mixed alphanumeric strings.
```java
import java.util.Arrays;
import java.util.List;

public class LexicographicSorting {
    public static void main(String[] args) {
        List<String> strings = Arrays.asList("apple1", "apple10", "apple2", "banana5", "banana3");

        // Sort with String's default character-by-character ordering.
        strings.sort(String::compareTo);

        System.out.println("Lexicographically sorted: " + strings);
    }
}
```
2.1 Code Explanation and Output
The given Java program demonstrates lexicographic sorting of a list of strings using the `String::compareTo` method. It initializes a list containing "apple1", "apple10", "apple2", "banana5", and "banana3", then sorts it in lexicographic (dictionary) order, meaning it compares characters one by one based on their Unicode values. As a result, "apple10" appears before "apple2": the two strings first differ at the sixth character, where "1" comes before "2" in Unicode order, and the digits are never treated as a whole number. Finally, the sorted list is printed to the console.
```
Lexicographically sorted: [apple1, apple10, apple2, banana3, banana5]
```
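To see where the comparison is decided, it can help to look at a single pair of strings. The following is a minimal sketch; the class `CompareToDemo` and the helper `firstDifference` are names made up for this illustration and are not part of the JDK.

```java
class CompareToDemo {

    // Hypothetical helper: index of the first position where the two strings
    // differ, or -1 if one string is a prefix of the other (or they are equal).
    static int firstDifference(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        for (int i = 0; i < limit; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String a = "apple10";
        String b = "apple2";

        // The strings share the prefix "apple" and first differ at index 5.
        int index = firstDifference(a, b);
        System.out.println("First difference at index " + index
                + ": '" + a.charAt(index) + "' vs '" + b.charAt(index) + "'");

        // A negative compareTo result means a sorts before b.
        System.out.println("apple10 before apple2: " + (a.compareTo(b) < 0));

        // When one string is a prefix of the other, the shorter one sorts first.
        System.out.println("apple1 before apple10: " + ("apple1".compareTo("apple10") < 0));
    }
}
```

Both checks print `true`: the character '1' at index 5 decides the first comparison, and the trailing "0" is never examined.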
3. Implementing Natural Alphanumeric Sorting
To sort alphanumeric strings naturally (i.e., ensuring ‘apple2’ appears before ‘apple10’), we need a custom comparator. Sorting on the extracted number alone is not enough: it would place "banana3" and "banana5" ahead of "apple10", because 3 and 5 are smaller than 10. The comparator below therefore compares the text portion first, obtained with `replaceAll("\\d", "")`, and then the numeric portion, obtained with `replaceAll("\\D", "")` and converted to an integer so that it is compared numerically rather than lexicographically. It assumes each string consists of text followed by a single number that fits in an `int`.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NaturalSorting {
    public static void main(String[] args) {
        List<String> strings = Arrays.asList("apple1", "apple10", "apple2", "banana5", "banana3");

        // Compare the text portion first, then the number as an integer.
        strings.sort(Comparator.comparing(NaturalSorting::extractPrefix)
                .thenComparingInt(NaturalSorting::extractNumber));

        System.out.println("Naturally sorted: " + strings);
    }

    // Removes all digits, leaving the text portion ("apple10" -> "apple").
    private static String extractPrefix(String s) {
        return s.replaceAll("\\d", "");
    }

    // Removes all non-digits and parses the rest ("apple10" -> 10).
    // Strings without digits are treated as 0 instead of throwing an exception.
    private static int extractNumber(String s) {
        String digits = s.replaceAll("\\D", "");
        return digits.isEmpty() ? 0 : Integer.parseInt(digits);
    }
}
```

3.1 Code Explanation and Output
The given Java program demonstrates the natural sorting of strings containing numbers using `Comparator.comparing` chained with `thenComparingInt`. It initializes a list with "apple1", "apple10", "apple2", "banana5", and "banana3", then sorts the strings by their text portion and, where the text is equal, by the numerical value extracted from each string. The `extractPrefix` method removes all digits using `replaceAll("\\d", "")`, while the `extractNumber` method removes all non-digit characters using `replaceAll("\\D", "")` and converts the remaining digits into an integer. Unlike lexicographic sorting, this approach ensures that "apple2" comes before "apple10" because it compares the actual numeric values instead of character-by-character Unicode order. Finally, the sorted list is printed to the console.
```
Naturally sorted: [apple1, apple2, apple10, banana3, banana5]
```
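Stripping characters with a regular expression works for strings shaped like "apple10", but it merges separate numbers (for example, "file2b10" would yield 210) and overflows on very long digit runs. One possible alternative, sketched below, walks both strings and compares each run of digits by value. The class `ChunkedNaturalSort` and all of its helper names are hypothetical, and the sketch only recognizes the ASCII digits 0 to 9.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ChunkedNaturalSort {

    // Hypothetical comparator for this sketch: digit runs are compared by
    // value, everything else character by character.
    static final Comparator<String> NATURAL = ChunkedNaturalSort::compareNatural;

    static int compareNatural(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            if (isAsciiDigit(a.charAt(i)) && isAsciiDigit(b.charAt(j))) {
                // Both sides start a number here: compare the whole digit runs.
                int endA = digitRunEnd(a, i);
                int endB = digitRunEnd(b, j);
                int result = compareDigitRuns(a.substring(i, endA), b.substring(j, endB));
                if (result != 0) {
                    return result;
                }
                i = endA;
                j = endB;
            } else {
                int result = Character.compare(a.charAt(i), b.charAt(j));
                if (result != 0) {
                    return result;
                }
                i++;
                j++;
            }
        }
        // One string is exhausted: the one with nothing left sorts first.
        return Integer.compare(a.length() - i, b.length() - j);
    }

    static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Index just past the run of digits that starts at 'start'.
    static int digitRunEnd(String s, int start) {
        int end = start;
        while (end < s.length() && isAsciiDigit(s.charAt(end))) {
            end++;
        }
        return end;
    }

    // Compares two digit strings by numeric value without parsing them,
    // so very long numbers cannot overflow an int or long.
    static int compareDigitRuns(String x, String y) {
        String p = stripLeadingZeros(x);
        String q = stripLeadingZeros(y);
        if (p.length() != q.length()) {
            return Integer.compare(p.length(), q.length());
        }
        return p.compareTo(q);
    }

    static String stripLeadingZeros(String s) {
        int k = 0;
        while (k < s.length() - 1 && s.charAt(k) == '0') {
            k++;
        }
        return s.substring(k);
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("file10b2", "file2b10", "file2b9", "file1");
        files.sort(NATURAL);
        System.out.println(files);
    }
}
```

Running `main` prints `[file1, file2b9, file2b10, file10b2]`. Note that this sketch treats "a007" and "a7" as equal, so it is not consistent with `equals`; a tie-breaker could be chained with `thenComparing` if a strict total order is needed.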
4. Sorting Alphanumeric Strings Case-Insensitively
Since uppercase letters come before lowercase letters in Unicode order, a simple lexicographic sort places every string that starts with an uppercase letter ahead of those that start with a lowercase letter, which is usually not the desired ordering. Java provides `String.CASE_INSENSITIVE_ORDER`, a comparator that sorts strings while ignoring case differences. This ensures that ‘Apple2’ and ‘apple1’ are compared without considering their case. The comparison is still character by character, so numbers are not ordered by value.
```java
import java.util.Arrays;
import java.util.List;

public class CaseInsensitiveSorting {
    public static void main(String[] args) {
        List<String> strings = Arrays.asList("Apple2", "apple10", "apple1", "Banana5", "banana3");

        // Sort with the built-in comparator that ignores letter case.
        strings.sort(String.CASE_INSENSITIVE_ORDER);

        System.out.println("Case-insensitively sorted: " + strings);
    }
}
```
4.1 Code Explanation and Output
The given Java program demonstrates case-insensitive sorting of a list of strings using `String.CASE_INSENSITIVE_ORDER`. It initializes a list containing "Apple2", "apple10", "apple1", "Banana5", and "banana3", then sorts them while ignoring differences in uppercase and lowercase letters. This ensures that words are ordered based on their dictionary sequence regardless of capitalization: "Apple2" is compared as if it were "apple2", and "Banana5" as if it were "banana5". Because the comparison is still character by character, "apple10" comes before "Apple2", and "banana3" comes before "Banana5". Finally, the sorted list is printed to the console.

```
Case-insensitively sorted: [apple1, apple10, Apple2, banana3, Banana5]
```
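One detail worth keeping in mind: `String.CASE_INSENSITIVE_ORDER` treats strings such as "apple" and "Apple" as equal, so their relative order in the result simply follows their order in the input. If a predictable order is wanted, one option is to chain a case-sensitive tie-breaker. The sketch below illustrates this; the class `CaseInsensitiveTieBreak` and the helper `sortedCopy` are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class CaseInsensitiveTieBreak {

    // Ignore case first; strings that differ only by case fall back to the
    // default case-sensitive order, which puts uppercase before lowercase.
    static final Comparator<String> ORDER =
            String.CASE_INSENSITIVE_ORDER.thenComparing(Comparator.naturalOrder());

    // Hypothetical helper: returns a sorted copy and leaves the input untouched.
    static List<String> sortedCopy(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "Banana", "Apple", "banana");

        // Without a tie-breaker, equal-ignoring-case strings keep their input order.
        List<String> caseInsensitiveOnly = new ArrayList<>(words);
        caseInsensitiveOnly.sort(String.CASE_INSENSITIVE_ORDER);
        System.out.println("Case-insensitive only: " + caseInsensitiveOnly);

        System.out.println("With tie-breaker:      " + sortedCopy(words));
    }
}
```

The first line prints `[apple, Apple, Banana, banana]` and the second `[Apple, apple, Banana, banana]`. Whether uppercase or lowercase should win the tie is a design choice; reversing the tie-breaker with `Comparator.reverseOrder()` would put lowercase first.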
5. Conclusion
Sorting alphanumeric strings in Java can be done in multiple ways, depending on the requirement. Lexicographic sorting uses the default `compareTo` method but does not handle numbers naturally. A custom comparator can be implemented to extract numeric values and compare them as numbers. Case-insensitive sorting can be achieved using `String.CASE_INSENSITIVE_ORDER` to ensure consistent results across different letter cases. By choosing the right approach, Java developers can efficiently sort alphanumeric data based on their application’s needs.