Java Sort Alphanumeric Strings Example

Sorting alphanumeric strings is a common requirement in applications where data contains both letters and numbers. Java's built-in sorting mechanisms order such strings lexicographically, but in some cases natural sorting (which orders the numbers inside strings by value) is required. Let us look at how Java sorts alphanumeric strings using lexicographic, natural, and case-insensitive ordering.

1. Understanding the Problem

Given a list of alphanumeric strings, we need to sort them in different ways:

  • Lexicographically (default string sorting)
  • Natural order (numeric portions sorted as numbers)
  • Case-insensitive sorting

1.1 Comparison

| Sorting Method | Advantages | When to Use | Disadvantages |
|---|---|---|---|
| Lexicographic (default string sorting) | Simple, default behavior in Java; follows Unicode order. | Use when sorting purely alphabetical strings or when Unicode order is required. | Does not handle numbers naturally; “apple10” comes before “apple2” due to character-by-character comparison. |
| Natural order (numeric portions sorted as numbers) | Sorts numbers correctly within strings; “apple2” comes before “apple10”. | Use when dealing with alphanumeric data where numbers should be treated as numeric values. | Requires additional logic to extract and compare numbers, which adds complexity. |
| Case-insensitive sorting | Ignores case differences, ensuring “Apple” and “apple” are treated the same. | Use when sorting user-generated text or case-insensitive lists. | May not be suitable if case-sensitive differentiation is required. |

2. Lexicographic Sorting of Alphanumeric Strings

Lexicographic sorting compares strings character by character using their Unicode (UTF-16) values. For ASCII characters this means digits come before uppercase letters, which in turn come before lowercase letters. The String::compareTo method implements this ordering. Because the comparison is decided at the first differing character, ‘apple10’ sorts before ‘apple2’ (‘1’ is less than ‘2’), so this method does not yield numeric order for mixed alphanumeric strings.

import java.util.Arrays;
import java.util.List;
 
public class LexicographicSorting {
    public static void main(String[] args) {
        List<String> strings = Arrays.asList("apple1", "apple10", "apple2", "banana5", "banana3");
         
        // String::compareTo compares UTF-16 code units position by position.
        strings.sort(String::compareTo);
         
        System.out.println("Lexicographically sorted: " + strings);
    }
}

2.1 Code Explanation and Output

The given Java program demonstrates lexicographic sorting of a list of strings using the String::compareTo method. It initializes a list containing "apple1", "apple10", "apple2", "banana5", and "banana3", then sorts it in dictionary order, comparing characters one by one based on their Unicode values. (Java calls this the "natural ordering" of String, which should not be confused with the natural alphanumeric sorting covered in the next section.) As a result, "apple10" appears before "apple2": the two strings first differ at the sixth character, where "1" comes before "2", and the digits are never interpreted as numbers. Finally, the sorted list is printed to the console.

Lexicographically sorted: [apple1, apple10, apple2, banana3, banana5]
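
The ordering rules described in this section (digits before uppercase letters, uppercase before lowercase) can be checked with a small sketch. The class name UnicodeOrderDemo, the helper sorted, and the sample values are illustrative assumptions and are not part of the original example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class UnicodeOrderDemo {
    // Hypothetical helper: returns a sorted copy using String's compareTo ordering.
    static List<String> sorted(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<String> mixed = Arrays.asList("banana", "Apple", "apple", "10", "2");

        // Digits sort before uppercase letters, which sort before lowercase letters.
        System.out.println(sorted(mixed)); // prints [10, 2, Apple, apple, banana]

        // A negative result means "apple10" is ordered before "apple2".
        System.out.println("apple10".compareTo("apple2")); // prints -1
    }
}
```

Note that "10" lands before "2" for the same reason "apple10" lands before "apple2": only the first differing character decides the result.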

3. Implementing Natural Alphanumeric Sorting

To sort alphanumeric strings naturally (i.e., ensuring ‘apple2’ appears before ‘apple10’), we need a custom comparator. The approach below splits each string into a text part and a numeric part: replaceAll("\\d", "") keeps the text, while replaceAll("\\D", "") keeps the digits, which are then parsed as an integer. Comparing the text first and the number second keeps entries with the same prefix together and orders them numerically rather than lexicographically.

import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NaturalSorting {
    public static void main(String[] args) {
        List<String> strings = Arrays.asList("apple1", "apple10", "apple2", "banana5", "banana3");

        // Compare the text part first, then the numeric part as an integer.
        strings.sort(Comparator.comparing(NaturalSorting::extractText)
                .thenComparingInt(NaturalSorting::extractNumber));

        System.out.println("Naturally sorted: " + strings);
    }

    // Removes every digit, leaving only the text part ("apple10" -> "apple").
    private static String extractText(String s) {
        return s.replaceAll("\\d", "");
    }

    // Removes every non-digit and parses the rest ("apple10" -> 10).
    // Strings without digits count as 0 to avoid a NumberFormatException.
    private static int extractNumber(String s) {
        String digits = s.replaceAll("\\D", "");
        return digits.isEmpty() ? 0 : Integer.parseInt(digits);
    }
}

3.1 Code Explanation and Output

The given Java program demonstrates natural sorting of strings containing numbers using Comparator.comparing followed by thenComparingInt. It initializes a list with "apple1", "apple10", "apple2", "banana5", and "banana3". The extractText method removes all digits using replaceAll("\\d", ""), and the extractNumber method removes all non-digit characters using replaceAll("\\D", "") and converts the remaining digits into an integer. The text part is compared first; sorting on the number alone would place "banana3" and "banana5" between "apple2" and "apple10". Within the same text part, the numeric values decide the order, so "apple2" comes before "apple10", unlike in lexicographic sorting. This simple approach assumes each string contains a single number that fits in an int. Finally, the sorted list is printed to the console.

Naturally sorted: [apple1, apple2, apple10, banana3, banana5]
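
Stripping non-digits with replaceAll("\\D", "") merges every digit in a string into one number, so a value such as "file2part10" would be read as 210. One possible alternative, sketched here under assumptions, walks both strings run by run and compares digit runs as numbers. The class name ChunkedNaturalSort, the NATURAL constant, and the sample file names are hypothetical, and only the ASCII digits 0-9 are treated as numeric.

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ChunkedNaturalSort {
    // Hypothetical comparator: digit runs are compared by numeric value,
    // every other character is compared by its Unicode value.
    static final Comparator<String> NATURAL = (a, b) -> {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            char ca = a.charAt(i), cb = b.charAt(j);
            if (isAsciiDigit(ca) && isAsciiDigit(cb)) {
                // Find the end of the digit run in each string.
                int iEnd = i, jEnd = j;
                while (iEnd < a.length() && isAsciiDigit(a.charAt(iEnd))) iEnd++;
                while (jEnd < b.length() && isAsciiDigit(b.charAt(jEnd))) jEnd++;
                // BigInteger avoids overflow on very long digit runs.
                int cmp = new BigInteger(a.substring(i, iEnd))
                        .compareTo(new BigInteger(b.substring(j, jEnd)));
                if (cmp != 0) return cmp;
                i = iEnd;
                j = jEnd;
            } else {
                if (ca != cb) return Character.compare(ca, cb);
                i++;
                j++;
            }
        }
        // Whichever string has characters left over sorts last.
        return (a.length() - i) - (b.length() - j);
    };

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("file2part10", "file2part9", "file10part1", "file1");
        files.sort(NATURAL);
        System.out.println(files); // prints [file1, file2part9, file2part10, file10part1]
    }
}
```

With this sketch, strings that differ only in leading zeros (such as "a01" and "a1") compare as equal, so their relative order is left as it was in the input.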

4. Sorting Alphanumeric Strings Case-Insensitively

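Since uppercase letters come before lowercase letters in Unicode order, a plain lexicographic sort places every capitalized string ahead of the lowercase ones (for example, ‘Banana5’ before ‘apple1’). Java provides String.CASE_INSENSITIVE_ORDER, a comparator that ignores case differences when comparing strings. With it, ‘Apple2’ and ‘apple1’ are compared as if both were written in the same case. Note that this comparator does not take locale into account, and it still compares digits character by character rather than numerically.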

import java.util.Arrays;
import java.util.List;

public class CaseInsensitiveSorting {
    public static void main(String[] args) {
        List<String> strings = Arrays.asList("Apple2", "apple10", "apple1", "Banana5", "banana3");

        // Built-in comparator that ignores case when comparing characters.
        strings.sort(String.CASE_INSENSITIVE_ORDER);

        System.out.println("Case-insensitively sorted: " + strings);
    }
}

4.1 Code Explanation and Output

The given Java program demonstrates case-insensitive sorting of a list of strings using String.CASE_INSENSITIVE_ORDER. It initializes a list containing "Apple2", "apple10", "apple1", "Banana5", and "banana3", then sorts them while ignoring differences between uppercase and lowercase letters. Words are therefore ordered by their dictionary sequence regardless of capitalization: "Apple2" is grouped with "apple1" and "apple10", and "Banana5" is grouped with "banana3". The digits are still compared character by character, so "apple10" comes before "Apple2". Finally, the sorted list is printed to the console.

Case-insensitively sorted: [apple1, apple10, Apple2, banana3, Banana5]
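
String.CASE_INSENSITIVE_ORDER on its own still compares digits character by character. If both behaviors are wanted together, one option is to apply the case-insensitive comparator to the text part only and compare the numeric part as an integer. The following is a minimal sketch of that idea; the class name, the BY_TEXT_THEN_NUMBER constant, and the textPart and numberPart helpers are assumptions made for illustration, and each string is assumed to hold at most one number that fits in an int.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class CaseInsensitiveNaturalSorting {
    // Hypothetical comparator: text part ignoring case, then the number by value.
    static final Comparator<String> BY_TEXT_THEN_NUMBER =
            Comparator.comparing(CaseInsensitiveNaturalSorting::textPart,
                            String.CASE_INSENSITIVE_ORDER)
                    .thenComparingInt(CaseInsensitiveNaturalSorting::numberPart);

    // Removes every digit ("Apple2" -> "Apple").
    static String textPart(String s) {
        return s.replaceAll("\\d", "");
    }

    // Removes every non-digit and parses the rest; no digits counts as 0.
    static int numberPart(String s) {
        String digits = s.replaceAll("\\D", "");
        return digits.isEmpty() ? 0 : Integer.parseInt(digits);
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("Apple2", "apple10", "apple1", "Banana5", "banana3");
        strings.sort(BY_TEXT_THEN_NUMBER);
        System.out.println(strings); // prints [apple1, Apple2, apple10, banana3, Banana5]
    }
}
```

Here "Apple2" moves ahead of "apple10" because 2 is less than 10 once the text parts are considered equal.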

5. Conclusion

Sorting alphanumeric strings in Java can be done in multiple ways, depending on the requirement. Lexicographic sorting uses the default compareTo method but does not handle numbers naturally. A custom comparator can be implemented to extract numeric values and sort them properly. Case-insensitive sorting can be achieved using String.CASE_INSENSITIVE_ORDER to ensure consistent results across different letter cases. By choosing the right approach, Java developers can efficiently sort alphanumeric data based on their application’s needs.

Yatin Batra

An experienced full-stack engineer well versed in Core Java, Spring/Spring Boot, MVC, Security, AOP, Frontend (Angular & React), and cloud technologies (such as AWS, GCP, Jenkins, Docker, K8s).