e431. Parsing Character-Separated Data with a Regular Expression

A line from a flat file is typically divided into fields by a separator. If the separator is a single character such as a comma or a tab, the StringTokenizer class can be used to parse the line into fields. If the separator is more complex (e.g., a comma followed by a space), a regular expression is needed. String.split() takes a regular expression that describes the separator and returns the fields as an array.
String.split() returns only the nondelimiter strings; the text matched by the separator is discarded. To obtain the delimiter strings as well, see e432 Parsing a String into Tokens Using a Regular Expression.

Note: StringTokenizer does not handle empty fields. For example, given the line a,,b, it returns two fields (a and b) rather than three, silently discarding the empty second field. String.split() preserves empty fields in the middle of a line. With the default limit, however, it drops trailing empty fields; pass a negative limit, as in split(patternStr, -1), to keep them.
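
The difference described in the note can be checked with a small side-by-side sketch. The class name EmptyFieldDemo and the helper methods tokenize() and split() are illustrative names, not part of the original example; only StringTokenizer and String.split() come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; the names here are assumptions made for illustration.
class EmptyFieldDemo {
    // Collects every token StringTokenizer produces for the given delimiter characters.
    static List<String> tokenize(String line, String delims) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, delims);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // Splits on a regular expression; the negative limit keeps all empty fields.
    static List<String> split(String line, String regex) {
        return Arrays.asList(line.split(regex, -1));
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a,,b", ",").size()); // 2
        System.out.println(split("a,,b", ",").size());    // 3
    }
}
```

StringTokenizer treats consecutive delimiters as a single gap, which is why the empty field disappears; split() reports one field per gap between separators.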

    // Parse a comma-separated string
    String inputStr = "a,,b";
    String patternStr = ",";
    String[] fields = inputStr.split(patternStr);
    // ["a", "", "b"]
    
    // Parse a line whose separator is a comma followed by a space
    inputStr = "a, b, c,d";
    patternStr = ", ";
    fields = inputStr.split(patternStr, -1);
    // ["a", "b", "c,d"]
    
    // Parse a line with and's and or's
    inputStr = "a, b, and c";
    patternStr = "[, ]+(and|or)*[, ]*";
    fields = inputStr.split(patternStr, -1);
    // ["a", "b", "c"]
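
Two of the examples above pass -1 as the second argument to split(). The following sketch shows what that limit changes when a line ends with empty fields. The class name SplitLimitDemo, its two helper methods, and the sample line "a,b,," are assumptions chosen for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class; the names here are assumptions made for illustration.
class SplitLimitDemo {
    // Default limit (0): trailing empty strings are removed from the result.
    static String[] dropTrailing(String line) {
        return line.split(",");
    }

    // Negative limit: the pattern is applied as often as possible and
    // trailing empty strings are kept.
    static String[] keepAll(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String line = "a,b,,";
        System.out.println(Arrays.toString(dropTrailing(line))); // [a, b]
        System.out.println(Arrays.toString(keepAll(line)));      // [a, b, , ]
    }
}
```

When every record must yield the same number of columns, the negative limit is usually the safer choice, since a record whose last columns are empty would otherwise come back shorter than the rest.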
