JAVA资料 ·

JDK1.8.0_161_64源码解析系列之java.lang.String

1、接口解释

(1)Serializable 这个序列化接口没有任何方法和域,仅用于标识序列化的语意。

(2)Comparable<String> 用于对两个实例化对象比较大小

(3)CharSequence 这个接口是一个只读的字符序列。包括length(), 
charAt(int index), subSequence(int start, int end)这几个API接口

2、主要变量

(1)private final char value[];
可以看到,value[]是存储String的内容的,即当使用String str = "abcd";
的时候,本质上,"abcd"是存储在一个char类型的数组中的。

(2) private int hash;

而hash是String实例化的hashcode的一个缓存。因为String经常被用于比较,比如在HashMap中。
如果每次进行比较都重新计算hashcode的值的话,那无疑是比较麻烦的,而保存一个hashcode的缓存无疑能优化这样的操作。

(3)private static final long serialVersionUID = -6849794470754667710L;

Java的序列化机制是通过在运行时判断类的serialVersionUID来验证版本一致性的。在进行反序列化时,JVM会把传来
的字节流中的serialVersionUID与本地相应实体(类)的serialVersionUID进行比较,如果相同就认为是一致的,可以进行反序
列化,否则就会出现序列化版本不一致的异常,如果我们不希望通过编译来强制划分软件版本,即实现序列化接口的实体能够兼容先前版本,
未作更改的类,就需要显式地定义一个名为serialVersionUID,类型为long的变量,不修改这个变量值的序列化实体都可以相互进行
串行化和反串行化

(4) private static final ObjectStreamField[] serialPersistentFields =

    new ObjectStreamField[0];

3、构造方法

(1) public String()
(2) public String(String original) 
(3) public String(char value[])
(4) public String(char value[], int offset, int count)
(5) public String(int[] codePoints, int offset, int count)
(6) public String(byte ascii[], int hibyte, int offset, int count)
(7) public String(byte ascii[], int hibyte) 
(8) public String(byte bytes[], int offset, int length, String charsetName)
(9) public String(byte bytes[], int offset, int length, Charset charset) 
(10)public String(byte bytes[], String charsetName)
(11)public String(byte bytes[], Charset charset) 
(12)public String(byte bytes[], int offset, int length)
(13)public String(byte bytes[]) 
(14)public String(StringBuffer buffer)
(15)public String(StringBuilder builder)
String支持多种初始化方法,包括接收String,char[],byte[],StringBuffer等多种参数类型的初始化方法。
但本质上,其实就是将接收到的参数传递给全局变量value[]。

4、内部方法

(1)public int length() {   
      return value.length;
}
(2)public boolean isEmpty() { 
   return value.length == 0;
}
(3)public char charAt(int index) {    
      if ((index < 0) || (index >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return value[index];
  }

知道了String其实内部是通过char[]实现的,那么就不难发现length(),isEmpty(),charAt()这些方法其实就是在内部调用数组的方法。

(4)//返回指定索引的代码点
public int codePointAt(int index) { 
  if ((index < 0) || (index >= value.length)) {
       throw new StringIndexOutOfBoundsException(index);
   }
   return Character.codePointAtImpl(value, index, value.length);

}

//返回指定索引前一个代码点
(5) public int codePointBefore(int index) {
    int i = index - 1;
    if ((i < 0) || (i >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return Character.codePointBeforeImpl(value, index, 0);
}
//返回指定起始到结束段内字符个数
(6)public int codePointCount(int beginIndex, int endIndex) {
    if (beginIndex < 0 || endIndex > value.length || beginIndex > endIndex) {
        throw new IndexOutOfBoundsException();
    }
    return Character.codePointCountImpl(value, beginIndex, endIndex - beginIndex);
 }
  //返回指定索引加上codepointOffset后得到的索引值
(7)public int offsetByCodePoints(int index, int codePointOffset) {
    if (index < 0 || index > value.length) {
        throw new IndexOutOfBoundsException();
    }
    return Character.offsetByCodePointsImpl(value, 0, value.length,
            index, codePointOffset);
}

//将字符串复制到dst数组中,复制到dst数组中的起始位置可以指定。值得注意的是,该方法并没有检测复制到dst数组后是否越界。
(8) void getChars(char dst[], int dstBegin) {
    System.arraycopy(value, 0, dst, dstBegin, value.length);
}
(9) public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
    if (srcBegin < 0) {
        throw new StringIndexOutOfBoundsException(srcBegin);
    }
    if (srcEnd > value.length) {
        throw new StringIndexOutOfBoundsException(srcEnd);
    }
    if (srcBegin > srcEnd) {
        throw new StringIndexOutOfBoundsException(srcEnd - srcBegin);
    }
    System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
}
//获取当前字符串的二进制
(10) public void getBytes(int srcBegin, int srcEnd, byte dst[], int dstBegin) {
    if (srcBegin < 0) {
        throw new StringIndexOutOfBoundsException(srcBegin);
    }
    if (srcEnd > value.length) {
        throw new StringIndexOutOfBoundsException(srcEnd);
    }
    if (srcBegin > srcEnd) {
        throw new StringIndexOutOfBoundsException(srcEnd - srcBegin);
    }
    Objects.requireNonNull(dst);

    int j = dstBegin;
    int n = srcEnd;
    int i = srcBegin;
    char[] val = value;   /* avoid getfield opcode */

    while (i < n) {
        dst[j++] = (byte)val[i++];
    }
}
(11)public byte[] getBytes(String charsetName)
        throws UnsupportedEncodingException {
    if (charsetName == null) throw new NullPointerException();
    return StringCoding.encode(charsetName, value, 0, value.length);
}
(12)public byte[] getBytes() {  
     return StringCoding.encode(value, 0, value.length); 
}
(13) public boolean equals(Object anObject) {  
  if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;
        int n = value.length;
        if (n == anotherString.value.length) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = 0;
            while (n-- != 0) {
                if (v1[i] != v2[i])
                    return false;
                i++;
            }
            return true;
        }
    }
    return false;
}
(14)public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;

            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }
(15)public boolean contentEquals(CharSequence cs) {
    // Argument is a StringBuffer, StringBuilder
    if (cs instanceof AbstractStringBuilder) {
        if (cs instanceof StringBuffer) {
            synchronized(cs) {
               return nonSyncContentEquals((AbstractStringBuilder)cs);
            }
        } else {
            return nonSyncContentEquals((AbstractStringBuilder)cs);
        }
    }
    // Argument is a String
    if (cs instanceof String) {
        return equals(cs);
    }
    // Argument is a generic CharSequence
    char v1[] = value;
    int n = v1.length;
    if (n != cs.length()) {
        return false;
    }
    for (int i = 0; i < n; i++) {
        if (v1[i] != cs.charAt(i)) {
            return false;
        }
    }
    return true;
   }

   这个主要是用来比较String和StringBuffer或者StringBuild的内容是否一样。可以看到传入参数是CharSequence ,
   这也说明了StringBuffer和StringBuild同样是实现了CharSequence。源码中先判断参数是从哪一个类实例化来的,
   再根据不同的情况采用不同的方案,不过其实大体都是采用上面那个for循环的方式来进行判断两字符串是否内容相同。
(16)public int compareTo(String anotherString) {
    int len1 = value.length;
    int len2 = anotherString.value.length;
    int lim = Math.min(len1, len2);
    char v1[] = value;
    char v2[] = anotherString.value;

    int k = 0;
    while (k < lim) {
        char c1 = v1[k];
        char c2 = v2[k];
        if (c1 != c2) {
            return c1 - c2;
        }
        k++;
    }
    return len1 - len2;
   }

   这个就是String对Comparable接口中方法的实现了。其核心就是那个while循环,通过从第一个开始比较每一个字符,
   当遇到第一个较小的字符时,判定该字符串小。
(17)public int compareTo(String anotherString) {
    int len1 = value.length;
    int len2 = anotherString.value.length;
    int lim = Math.min(len1, len2);
    char v1[] = value;
    char v2[] = anotherString.value;

    int k = 0;
    while (k < lim) {
        char c1 = v1[k];
        char c2 = v2[k];
        if (c1 != c2) {
            return c1 - c2;
        }
        k++;
    }
    return len1 - len2;
}
这个就是String对Comparable接口中方法的实现了。其核心就是那个while循环,通过从第一个开始比较每一个字符,当遇到第一个较小的字符时,判定该字符串小
 (18)public int compareToIgnoreCase(String str) {
    return CASE_INSENSITIVE_ORDER.compare(this, str);
}
这个也是比较字符串大小,规则和上面那个比较方法基本相同,差别在于这个方法忽略大小写
(19)public boolean regionMatches(int toffset, String other, int ooffset,int len) {
    char ta[] = value;
    int to = toffset;
    char pa[] = other.value;
    int po = ooffset;
    // Note: toffset, ooffset, or len might be near -1>>>1.
    if ((ooffset < 0) || (toffset < 0)
            || (toffset > (long)value.length - len)
            || (ooffset > (long)other.value.length - len)) {
        return false;
    }
    while (len-- > 0) {
        if (ta[to++] != pa[po++]) {
            return false;
        }
    }
    return true;
}
比较该字符串和其他一个字符串从分别指定地点开始的n个字符是否相等。看代码可知道,其原理还是通过一个while去循环对应的比较区域进行判断,但在比较之前会做判定,判定给定参数是否越界。
(20) public boolean startsWith(String prefix, int toffset) {
        char ta[] = value;
        int to = toffset;
        char pa[] = prefix.value;
        int po = 0;
        int pc = prefix.value.length;
        // Note: toffset might be near -1>>>1.
        if ((toffset < 0) || (toffset > value.length - pc)) {
            return false;
        }
        while (--pc >= 0) {
            if (ta[to++] != pa[po++]) {
                return false;
            }
        }
        return true;
    }  
判断当前字符串是否以某一段其他字符串开始的,和其他字符串比较方法一样,其实就是通过一个while来循环比较。
(21)public int indexOf(int ch, int fromIndex) {
        final int max = value.length;
        if (fromIndex < 0) {
            fromIndex = 0;
        } else if (fromIndex >= max) {
            // Note: fromIndex might be near -1>>>1.
            return -1;
        }

        if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
            // handle most cases here (ch is a BMP code point or a
            // negative value (invalid code point))
            final char[] value = this.value;
            for (int i = fromIndex; i < max; i++) {
                if (value[i] == ch) {
                    return i;
                }
            }
            return -1;
        } else {
            return indexOfSupplementary(ch, fromIndex);
        }
    }
(22) public int indexOf(int ch) {
        return indexOf(ch, 0);
    }
(23) public String substring(int beginIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        int subLen = value.length - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        return (beginIndex == 0) ? this : new String(value, beginIndex, subLen);
    }
这个方法可以返回字符串中一个子串,看最后一行可以发现,其实就是指定头尾,然后构造一个新的字符串。
(24) public String concat(String str) {
        int otherLen = str.length();
        if (otherLen == 0) {
            return this;
        }
        int len = value.length;
        char buf[] = Arrays.copyOf(value, len + otherLen);
        str.getChars(buf, len);
        return new String(buf, true);
    }

    concat的作用是将str拼接到当前字符串后面,通过代码也可以看出其实就是建一个新的字符串。
(25) public String replace(char oldChar, char newChar) {
        if (oldChar != newChar) {
            int len = value.length;
            int i = -1;
            char[] val = value; /* avoid getfield opcode */

            while (++i < len) {
                if (val[i] == oldChar) {
                    break;
                }
            }
            if (i < len) {
                char buf[] = new char[len];
                for (int j = 0; j < i; j++) {
                    buf[j] = val[j];
                }
                while (i < len) {
                    char c = val[i];
                    buf[i] = (c == oldChar) ? newChar : c;
                    i++;
                }
                return new String(buf, true);
            }
        }
        return this;
    }
    替换操作,主要是将原来字符串中的oldChar全部替换成newChar。看这里实现,主要是先找到第一个所要替换的字符串的位置 i ,
    将i之前的字符直接复制到一个新char数组。然后从 i 开始再对每一个字符进行判断是不是所要替换的字符。
(26) public String trim() {
        int len = value.length;
        int st = 0;
        char[] val = value;    /* avoid getfield opcode */

        while ((st < len) && (val[st] <= ' ')) {
            st++;
        }
        while ((st < len) && (val[len - 1] <= ' ')) {
            len--;
        }
        return ((st > 0) || (len < value.length)) ? substring(st, len) : this;
    }

    这个函数平时用的应该比较多,删除字符串前后的空格,原理是通过找出前后第一个不是空格的字符串,返回原字符串的该子串。
(27) public CharSequence subSequence(int beginIndex, int endIndex) {
        return this.substring(beginIndex, endIndex);
    }
    返回一个新的字符类型的字符串
(28) public boolean startsWith(String prefix, int toffset) {
      char ta[] = value;
      int to = toffset;
      char pa[] = prefix.value;
      int po = 0;
      int pc = prefix.value.length;
      // Note: toffset might be near -1>>>1.
      //如果起始地址小于0或者(起始地址+所比较对象长度)大于自身对象长度,返回假
      if ((toffset < 0) || (toffset > value.length - pc)) {
          return false;
      }
      //从所比较对象的末尾开始比较
      while (--pc >= 0) {
          if (ta[to++] != pa[po++]) {
              return false;
          }
      }
      return true;
  }
(29) public boolean startsWith(String prefix) {
      return startsWith(prefix, 0);
  }
(30) public boolean endsWith(String suffix) { 
     return startsWith(suffix, value.length - suffix.value.length);
  }
 起始比较和末尾比较都是比较经常用得到的方法,例如在判断一个字符串是不是http协议的,
 或者初步判断一个文件是不是mp3文件,都可以采用这个方法进行比较。

(31) public native String intern();

intern方法是Native调用,它的作用是在方法区中的常量池里通过equals方法寻找等值的对象,
如果没有找到则在常量池中开辟一片空间存放字符串并返回该对应String的引用,
否则直接返回常量池中已存在String对象的引用

(32) public String toLowerCase() {
    return toLowerCase(Locale.getDefault());
 }
(33) public String toUpperCase(Locale locale) {
    if (locale == null) {
        throw new NullPointerException();
    }

    int firstLower;
    final int len = value.length;

    /* Now check if there are any characters that need to be changed. */
    scan: {
        for (firstLower = 0 ; firstLower < len; ) {
            int c = (int)value[firstLower];
            int srcCount;
            if ((c >= Character.MIN_HIGH_SURROGATE)
                    && (c <= Character.MAX_HIGH_SURROGATE)) {
                c = codePointAt(firstLower);
                srcCount = Character.charCount(c);
            } else {
                srcCount = 1;
            }
            int upperCaseChar = Character.toUpperCaseEx(c);
            if ((upperCaseChar == Character.ERROR)
                    || (c != upperCaseChar)) {
                break scan;
            }
            firstLower += srcCount;
        }
        return this;
    }

    /* result may grow, so i+resultOffset is the write location in result */
    int resultOffset = 0;
    char[] result = new char[len]; /* may grow */

    /* Just copy the first few upperCase characters. */
    System.arraycopy(value, 0, result, 0, firstLower);

    String lang = locale.getLanguage();
    boolean localeDependent =
            (lang == "tr" || lang == "az" || lang == "lt");
    char[] upperCharArray;
    int upperChar;
    int srcChar;
    int srcCount;
    for (int i = firstLower; i < len; i += srcCount) {
        srcChar = (int)value[i];
        if ((char)srcChar >= Character.MIN_HIGH_SURROGATE &&
            (char)srcChar <= Character.MAX_HIGH_SURROGATE) {
            srcChar = codePointAt(i);
            srcCount = Character.charCount(srcChar);
        } else {
            srcCount = 1;
        }
        if (localeDependent) {
            upperChar = ConditionalSpecialCasing.toUpperCaseEx(this, i, locale);
        } else {
            upperChar = Character.toUpperCaseEx(srcChar);
        }
        if ((upperChar == Character.ERROR)
                || (upperChar >= Character.MIN_SUPPLEMENTARY_CODE_POINT)) {
            if (upperChar == Character.ERROR) {
                if (localeDependent) {
                    upperCharArray =
                            ConditionalSpecialCasing.toUpperCaseCharArray(this, i, locale);
                } else {
                    upperCharArray = Character.toUpperCaseCharArray(srcChar);
                }
            } else if (srcCount == 2) {
                resultOffset += Character.toChars(upperChar, result, i + resultOffset) - srcCount;
                continue;
            } else {
                upperCharArray = Character.toChars(upperChar);
            }

            /* Grow result if needed */
            int mapLen = upperCharArray.length;
            if (mapLen > srcCount) {
                char[] result2 = new char[result.length + mapLen - srcCount];
                System.arraycopy(result, 0, result2, 0, i + resultOffset);
                result = result2;
            }
            for (int x = 0; x < mapLen; ++x) {
                result[i + resultOffset + x] = upperCharArray[x];
            }
            resultOffset += (mapLen - srcCount);
        } else {
            result[i + resultOffset] = (char)upperChar;
        }
    }
    return new String(result, 0, len + resultOffset);
}
将字符串的大写转换为小写,将字符串小写转换为大写
(34)public char[] toCharArray() {
      char result[] = new char[value.length];
      System.arraycopy(value, 0, result, 0, value.length);
      return result;
  }
将字符串转化为数组,由于本身就是数组的形式,只需将其拷贝即可。
(35) public String[] split(String regex, int limit) {    
    char ch = 0;
    if (((regex.value.length == 1 &&
         ".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
         (regex.length() == 2 &&
          regex.charAt(0) == '\\' &&
          (((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 &&
          ((ch-'a')|('z'-ch)) < 0 &&
          ((ch-'A')|('Z'-ch)) < 0)) &&
        (ch < Character.MIN_HIGH_SURROGATE ||
         ch > Character.MAX_LOW_SURROGATE))
    {
        int off = 0;
        int next = 0;
        boolean limited = limit > 0;
        ArrayList<String> list = new ArrayList<>();
        while ((next = indexOf(ch, off)) != -1) {
            if (!limited || list.size() < limit - 1) {
                list.add(substring(off, next));
                off = next + 1;
            } else {    // last one
                //assert (list.size() == limit - 1);
                list.add(substring(off, value.length));
                off = value.length;
                break;
            }
        }
        // If no match was found, return this
        if (off == 0)
            return new String[]{this};

        // Add remaining segment
        if (!limited || list.size() < limit)
            list.add(substring(off, value.length));

        // Construct result
        int resultSize = list.size();
        if (limit == 0) {
            while (resultSize > 0 && list.get(resultSize - 1).length() == 0) {
                resultSize--;
            }
        }
        String[] result = new String[resultSize];
        return list.subList(0, resultSize).toArray(result);
    }
    return Pattern.compile(regex).split(this, limit);
}
(36)public String[] split(String regex) {
    return split(regex, 0);
 }

对于字符串 "boo:and:foo",regex为o,limit为5时,
splite方法首先去字符串里查找regex——o,然后把o做为分隔符,
逐个把o去掉并且把字符串分开,比如,发现b后面有一个o,于是把这个o去掉,
并且把字符串拆成"b", "o:and:foo"两个字符串(注意:b后面的两个o已经去掉了一个),
接下来看"o:and:foo"这个字符串,第一个字符就是o,
于是o前面相当于一个空串,把这个o去掉,"o:and:foo"被分开成"", ":and:foo"这样两个字符串,
以此类推循环5次就是splite("o", 5)方法的作用
(37) public static String join(CharSequence delimiter, CharSequence... elements) {
    Objects.requireNonNull(delimiter);
    Objects.requireNonNull(elements);
    // Number of elements not likely worth Arrays.stream overhead.
    StringJoiner joiner = new StringJoiner(delimiter);
    for (CharSequence cs: elements) {
        joiner.add(cs);
    }
    return joiner.toString();
 }
(38) public static String join(CharSequence delimiter,        Iterable<? extends CharSequence> elements) {
    Objects.requireNonNull(delimiter);
    Objects.requireNonNull(elements);
    StringJoiner joiner = new StringJoiner(delimiter);
    for (CharSequence cs: elements) {
        joiner.add(cs);
    }
    return joiner.toString();
}
将已给的数组用给定的字符进行分开,返回一个特定的字符串,例如
String [] strings = {"a","b","c","d"};
 info =  String.join( "a",strings );
 System.out.println( info );
打印的结果为 aabacad
(39) public static String format(String format, Object... args) {  
      return new Formatter().format(format, args).toString();
  }
(40) public static String format(Locale l, String format, Object... args) { 
   return new Formatter(l).format(format, args).toString();
  }
  将字符串串格式化为一定格式的形式返回给

(41) 各种各样的 valueOf(Object obj) 方法

作用就是将相应格式的对象转换为你字符串的格式