The default encoding used for strings when PrintWriter decorates FileWriter


import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

public class test2 {
    public static void main(String[] args) throws IOException {
        PrintWriter out = new PrintWriter(
                new BufferedWriter(new FileWriter("BasicFileOutput.out")));
        out.println("我是第一行"); // "I am the first line"
        out.close();
        // Show the stored file:
        System.out.println(
                new BufferedReader(new FileReader("BasicFileOutput.out")).readLine());
    }
}

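The program above writes to a file. A Writer works with characters, but what ends up in the file is bytes, so every character has to be encoded into one or more bytes before it is stored.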

The Java IO library is built on the decorator pattern. PrintWriter and BufferedWriter are both decorator classes, and both exist to add functionality.

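Ctrl+clicking into out.println to trace the source shows that each decorator class eventually calls the write method of its own Writer-typed member. In short: the PrintWriter object calls BufferedWriter's write, and the BufferedWriter object calls FileWriter's write. So the implementation that finally matters is FileWriter's write.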
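
The delegation chain can be observed without a debugger. The following is a minimal sketch, assuming a hypothetical RecordingWriter (not part of the JDK or of the original program) that stands in for FileWriter and simply records whatever the decorators hand down to it.

```java
import java.io.BufferedWriter;
import java.io.PrintWriter;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

class DecoratorChainDemo {

    /** Hypothetical innermost Writer: records every chunk of chars it receives. */
    static class RecordingWriter extends Writer {
        final List<String> chunks = new ArrayList<>();

        @Override
        public void write(char[] cbuf, int off, int len) {
            chunks.add(new String(cbuf, off, len));
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    /** Sends text through PrintWriter -> BufferedWriter -> sink and returns what reached the sink. */
    static String writeThroughChain(String text) {
        RecordingWriter sink = new RecordingWriter();
        PrintWriter out = new PrintWriter(new BufferedWriter(sink));
        out.print(text);
        out.close(); // closing flushes BufferedWriter's buffer into the sink
        return String.join("", sink.chunks);
    }

    public static void main(String[] args) {
        System.out.println(writeThroughChain("我是第一行"));
    }
}
```

The two decorators never touch bytes here: the innermost Writer receives the same chars that were passed to print, which is why the encoding question has to be answered inside FileWriter.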

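FileWriter's source turns out to contain no write method at all: write is already implemented in its parent class, OutputStreamWriter, where it calls the write method of se, a member variable of type StreamEncoder.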

//OutputStreamWriter
    private final StreamEncoder se;
    public void write(char cbuf[], int off, int len) throws IOException {
        se.write(cbuf, off, len);
    }

Next, look at StreamEncoder's write implementation:

//StreamEncoder
    public void write(char cbuf[], int off, int len) throws IOException {
        synchronized (lock) {
            ensureOpen();
            if ((off < 0) || (off > cbuf.length) || (len < 0) ||
                ((off + len) > cbuf.length) || ((off + len) < 0)) {
                throw new IndexOutOfBoundsException();
            } else if (len == 0) {
                return;
            }
            implWrite(cbuf, off, len); // calls the method below
        }
    }
    void implWrite(char cbuf[], int off, int len)
        throws IOException
    {
        CharBuffer cb = CharBuffer.wrap(cbuf, off, len);
        if (haveLeftoverChar)
            flushLeftoverChar(cb, false);
        while (cb.hasRemaining()) {
            // The key call: encode on the encoder member. Set a breakpoint here.
            CoderResult cr = encoder.encode(cb, bb, false);
            if (cr.isUnderflow()) {
                assert (cb.remaining() <= 1) : cb.remaining();
                if (cb.remaining() == 1) {
                    haveLeftoverChar = true;
                    leftoverChar = cb.get();
                }
                break;
            }
            if (cr.isOverflow()) {
                assert bb.position() > 0;
                writeBytes();
                continue;
            }
            cr.throwException();
        }
    }
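
With a breakpoint on the line CoderResult cr = encoder.encode(cb, bb, false), the debugger shows that encoder is a UTF-8 encoder. The analysis could stop here, but that still does not explain where UTF-8 comes from, so let's keep going.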
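
To see what that encode call produces, here is a small sketch that drives a CharsetEncoder by hand. It assumes UTF-8 (the charset observed at the breakpoint) and uses a hypothetical helper named utf8Hex. Unlike StreamEncoder, which passes false for endOfInput because more chars may follow, the sketch passes true since the whole string is supplied at once.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

class EncodeDemo {

    /** Encodes text as UTF-8 via CharsetEncoder.encode and returns the bytes as uppercase hex. */
    static String utf8Hex(String text) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CharBuffer cb = CharBuffer.wrap(text);
        // UTF-8 needs at most 3 bytes per Java char, so this buffer cannot overflow.
        ByteBuffer bb = ByteBuffer.allocate(text.length() * 3);

        CoderResult cr = encoder.encode(cb, bb, true); // true: no more input will follow
        if (cr.isError()) {
            throw new IllegalArgumentException("cannot encode input: " + cr);
        }
        encoder.flush(bb);
        bb.flip(); // switch the buffer from writing to reading

        StringBuilder hex = new StringBuilder();
        while (bb.hasRemaining()) {
            hex.append(String.format("%02X", bb.get() & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println(utf8Hex("我")); // one char becomes three bytes in UTF-8
        System.out.println(utf8Hex("A"));  // an ASCII char becomes one byte
    }
}
```

Each of the five characters in the sample line takes three bytes in UTF-8, so under this charset the text of the line occupies 15 bytes in the file, before the line separator.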

Since encoder is a member variable of StreamEncoder, check whether the constructor assigns it:

//StreamEncoder
    private StreamEncoder(OutputStream out, Object lock, CharsetEncoder enc) { // set a breakpoint here
        super(lock);
        this.out = out;
        this.ch = null;
        this.cs = enc.charset();
        this.encoder = enc;
        // This path disabled until direct buffers are faster
        if (false && out instanceof FileOutputStream) {
            ch = ((FileOutputStream)out).getChannel();
            if (ch != null)
                bb = ByteBuffer.allocateDirect(DEFAULT_BYTE_BUFFER_SIZE);
        }
        if (ch == null) {
            bb = ByteBuffer.allocate(DEFAULT_BYTE_BUFFER_SIZE);
        }
    }

The constructor does assign it, so go back to OutputStreamWriter and see where its StreamEncoder member se comes from:

//FileWriter
    // The program in this article uses this overload of the FileWriter constructor
    public FileWriter(String fileName) throws IOException {
        super(new FileOutputStream(fileName)); // calls the constructor of FileWriter's parent class, OutputStreamWriter
    }
//OutputStreamWriter
    // Per the call above, this overload of the OutputStreamWriter constructor is the one invoked
    public OutputStreamWriter(OutputStream out) {
        super(out);
        try {
            se = StreamEncoder.forOutputStreamWriter(out, this, (String)null); // the key line: se is assigned here
        } catch (UnsupportedEncodingException e) {
            throw new Error(e);
        }
    }

Trace further, into StreamEncoder's forOutputStreamWriter:

//StreamEncoder
    public static StreamEncoder forOutputStreamWriter(OutputStream out,
                                                      Object lock,
                                                      String charsetName) // per the call above, this argument is null
        throws UnsupportedEncodingException
    {
        String csn = charsetName;
        if (csn == null) // this branch is taken
            csn = Charset.defaultCharset().name();
        try {
            if (Charset.isSupported(csn))
                return new StreamEncoder(out, lock, Charset.forName(csn));
        } catch (IllegalCharsetNameException x) { }
        throw new UnsupportedEncodingException (csn);
    }

Then trace into Charset's defaultCharset method:

//Charset
    public static Charset defaultCharset() {
        if (defaultCharset == null) {
            synchronized (Charset.class) {
                String csn = AccessController.doPrivileged(
                    new GetPropertyAction("file.encoding"));
                Charset cs = lookup(csn);
                if (cs != null)
                    defaultCharset = cs;
                else
                    defaultCharset = forName("UTF-8");
            }
        }
        return defaultCharset;
    }
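
Now the answer is clear: the charset used to encode what is written to the file is the one named by the "file.encoding" system property. It was UTF-8 on the machine used here; in general its value depends on the platform and the JDK version. If the JVM does not support that charset, "UTF-8" is used as a fallback.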
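
The same conclusion can be checked at runtime. The sketch below assumes a throwaway temp file (the file name prefix and the helper fileWriterEncoding are made up for illustration) and asks a FileWriter which encoding it picked via getEncoding(), inherited from OutputStreamWriter. Note that getEncoding() may return a historical name such as "UTF8" rather than the canonical "UTF-8", so the comparison goes through Charset.forName.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

class DefaultCharsetDemo {

    /** Opens a FileWriter on a temp file and returns the encoding name it reports. */
    static String fileWriterEncoding() {
        try {
            File tmp = File.createTempFile("encoding-demo", ".txt");
            tmp.deleteOnExit();
            try (FileWriter fw = new FileWriter(tmp)) {
                return fw.getEncoding(); // must be read before the writer is closed
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
        System.out.println("default charset  = " + Charset.defaultCharset());
        System.out.println("FileWriter uses  = " + fileWriterEncoding());
    }
}
```

The printed values depend on the machine and JDK in use, so no expected output is given here.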
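To summarize:
In new PrintWriter(new BufferedWriter(new FileWriter("BasicFileOutput.out"))), the outer PrintWriter and BufferedWriter are only decorators that add functionality; they deal only with the program's memory.
FileWriter is what actually deals with the file: it encodes each char according to a charset and then writes the resulting bytes to the file.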
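One more note on PrintWriter and DataOutputStream: to write to a file, each has to decorate, directly or indirectly, a stream that is backed by the file, such as a FileWriter or a FileOutputStream. Whatever goes into a file has to be stored byte by byte.
For PrintWriter, the mapping comes from a charset, because a charset provides exactly the mapping "character <===> one or more bytes";
for DataOutputStream, the mapping is "Java data type <===> how that Java data type is stored in memory".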
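
The difference between the two mappings shows up when the same int goes through both classes. This is a minimal sketch that assumes an in-memory ByteArrayOutputStream in place of a file and an explicit UTF-8 charset, so the result does not depend on the platform default; the helper names asText and asBinary are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class TextVsBinaryDemo {

    /** PrintWriter: the int becomes decimal digits, and each digit char is encoded by the charset. */
    static byte[] asText(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(new OutputStreamWriter(bytes, StandardCharsets.UTF_8));
        out.print(value);
        out.close(); // flushes the encoded bytes into the byte array
        return bytes.toByteArray();
    }

    /** DataOutputStream: the int is written as its four bytes, high byte first. */
    static byte[] asBinary(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(asText(258)));   // [50, 53, 56], the chars '2', '5', '8'
        System.out.println(Arrays.toString(asBinary(258))); // [0, 0, 1, 2], i.e. 0x00000102
    }
}
```

A file written through PrintWriter therefore has to be read back with the same charset, while a file written through DataOutputStream has to be read back with the matching DataInputStream calls in the same order.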