
Downloading files from the browser in Java, and fixing garbled filenames and wrong extensions across browsers

I recently wrote a Java feature that streams a file to the browser for download. While testing the details, I ran into several problems.

I. Firefox garbles Chinese filenames on download, while other browsers do not

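1. Cause: after some research, it turns out that Firefox follows the specification strictly here. The relevant rule comes from RFC 2183.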

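The filename travels in the HTTP response header: Content-Disposition: attachment; filename=FILENAME. The filename parameter suggests a name for the browser to use when it saves the downloaded resource. However, RFC 2183 states that the filename may only use US-ASCII characters. Most popular browsers do accept non-US-ASCII characters there, but for lack of a common standard they disagree on the encoding scheme and on how the character set is specified. So the question is how a filename such as "naïvefile" (without the quotes, where the third letter is U+00EF) should be encoded into the Content-Disposition header. For HTTP, RFC 6266 answers this with the filename* parameter, whose value has the form charset'language'percent-encoded-name as defined in RFC 5987 (since replaced by RFC 8187).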

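2. Solution: the fix therefore only needs a small change to Content-Disposition. The key part is the filename*=utf-8'zh_cn' prefix: the charset, then an optional language tag between the two single quotes, then the percent-encoded name. The language tag can be left empty, as in filename*=UTF-8'' (a well-formed tag would be zh-CN rather than zh_cn).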

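byte[] source = (byte[]) result.get("exportData");
// RequestAttributes itself has no getResponse(); cast to ServletRequestAttributes first
HttpServletResponse response =
        ((ServletRequestAttributes) RequestContextHolder.getRequestAttributes()).getResponse();
if (response != null) {
    response.reset();
    // filename* carries the UTF-8 percent-encoded name (language tag left empty).
    // URLEncoder writes a space as "+", so map it to %20 for use in a header.
    response.setHeader("Content-Disposition", "attachment;filename*=UTF-8''"
            + URLEncoder.encode("导出文件名.csv", "UTF-8").replace("+", "%20"));
    response.setHeader("Connection", "close");
    response.setHeader("Content-Type", "application/octet-stream");
    OutputStream out = response.getOutputStream();
    out.write(source);
    out.flush();
    out.close();
}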
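
To make the header format easier to see in isolation, here is a minimal sketch that only builds the Content-Disposition value. The class name ContentDispositionSketch and the methods encodeFilename and build are made up for this illustration. It assumes you also want to send a plain ASCII filename as a fallback for clients that do not understand filename*, which the code above does not do.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class ContentDispositionSketch {

    /** Percent-encodes a filename as UTF-8 for the filename* parameter. */
    static String encodeFilename(String name) throws UnsupportedEncodingException {
        // URLEncoder targets HTML forms and writes a space as "+";
        // a header value needs "%20" instead.
        return URLEncoder.encode(name, "UTF-8").replace("+", "%20");
    }

    /**
     * Builds the header value with an ASCII fallback (filename) and the
     * UTF-8 form (filename*). The language tag between the quotes is empty.
     */
    static String build(String asciiFallback, String realName)
            throws UnsupportedEncodingException {
        return "attachment; filename=\"" + asciiFallback + "\"; "
                + "filename*=UTF-8''" + encodeFilename(realName);
    }

    public static void main(String[] args) throws Exception {
        // \u00EF is the letter "ï" from the naïve example above.
        System.out.println(build("naive file.txt", "na\u00EFve file.txt"));
    }
}
```

The value returned by build would be passed to response.setHeader("Content-Disposition", ...). Note that URLEncoder is not a perfect match for the header grammar (it leaves "*" unescaped, for example), so treat this as a starting point rather than a complete encoder.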
           

II. Safari on iOS saves every exported file with a .dms extension, whatever its type

Cause: the response header sets Content-Type to application/octet-stream, and in the MIME type list that type happens to map to the dms extension, so Safari simply saves the file with a .dms extension.

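Solution: most browsers are fine with the generic type application/octet-stream, but for special cases such as Safari it is best to send the Content-Type that matches the kind of file being generated.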

// Send the Content-Type that matches the file type; see the MIME table below
response.setHeader("Content-Type", "text/csv");
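
If the export can produce more than one kind of file, the type can be looked up from the filename. The sketch below is only an illustration: MimeLookupSketch and contentTypeFor are hypothetical names, and the map holds just a handful of entries copied from the MIME table in this post. Unknown or missing extensions fall back to application/octet-stream.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class MimeLookupSketch {

    private static final String DEFAULT_TYPE = "application/octet-stream";

    // A few extension -> type pairs taken from the MIME table in this post.
    private static final Map<String, String> TYPES = new HashMap<>();
    static {
        TYPES.put("csv", "text/csv");
        TYPES.put("txt", "text/plain");
        TYPES.put("pdf", "application/pdf");
        TYPES.put("zip", "application/zip");
        TYPES.put("xls", "application/vnd.ms-excel");
    }

    /** Returns the Content-Type for a filename, based on its extension. */
    static String contentTypeFor(String filename) {
        int dot = filename.lastIndexOf('.');
        if (dot < 0 || dot == filename.length() - 1) {
            return DEFAULT_TYPE; // no extension at all
        }
        String ext = filename.substring(dot + 1).toLowerCase(Locale.ROOT);
        return TYPES.getOrDefault(ext, DEFAULT_TYPE);
    }

    public static void main(String[] args) {
        System.out.println(contentTypeFor("report.csv"));
        System.out.println(contentTypeFor("archive.unknown"));
    }
}
```

The result would then go into response.setHeader("Content-Type", ...) in place of the hard-coded value.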
           

The MIME type list is attached below:

Extension Type/Subtype
(none) application/octet-stream
323 text/h323
acx application/internet-property-stream
ai application/postscript
aif audio/x-aiff
aifc audio/x-aiff
aiff audio/x-aiff
asf video/x-ms-asf
asr video/x-ms-asf
asx video/x-ms-asf
au audio/basic
avi video/x-msvideo
axs application/olescript
bas text/plain
bcpio application/x-bcpio
bin application/octet-stream
bmp image/bmp
c text/plain
cat application/vnd.ms-pkiseccat
cdf application/x-cdf
cer application/x-x509-ca-cert
class application/octet-stream
clp application/x-msclip
cmx image/x-cmx
cod image/cis-cod
cpio application/x-cpio
crd application/x-mscardfile
crl application/pkix-crl
crt application/x-x509-ca-cert
csh application/x-csh
css text/css
dcr application/x-director
der application/x-x509-ca-cert
dir application/x-director
dll application/x-msdownload
dms application/octet-stream
doc application/msword
dot application/msword
dvi application/x-dvi
dxr application/x-director
eps application/postscript
etx text/x-setext
evy application/envoy
exe application/octet-stream
fif application/fractals
flr x-world/x-vrml
gif image/gif
gtar application/x-gtar
gz application/x-gzip
h text/plain
hdf application/x-hdf
hlp application/winhlp
hqx application/mac-binhex40
hta application/hta
htc text/x-component
htm text/html
html text/html
htt text/webviewhtml
ico image/x-icon
ief image/ief
iii application/x-iphone
ins application/x-internet-signup
isp application/x-internet-signup
jfif image/pipeg
jpe image/jpeg
jpeg image/jpeg
jpg image/jpeg
js application/x-javascript
latex application/x-latex
lha application/octet-stream
lsf video/x-la-asf
lsx video/x-la-asf
lzh application/octet-stream
m13 application/x-msmediaview
m14 application/x-msmediaview
m3u audio/x-mpegurl
man application/x-troff-man
mdb application/x-msaccess
me application/x-troff-me
mht message/rfc822
mhtml message/rfc822
mid audio/mid
mny application/x-msmoney
mov video/quicktime
movie video/x-sgi-movie
mp2 video/mpeg
mp3 audio/mpeg
mpa video/mpeg
mpe video/mpeg
mpeg video/mpeg
mpg video/mpeg
mpp application/vnd.ms-project
mpv2 video/mpeg
ms application/x-troff-ms
mvb application/x-msmediaview
nws message/rfc822
oda application/oda
p10 application/pkcs10
p12 application/x-pkcs12
p7b application/x-pkcs7-certificates
p7c application/x-pkcs7-mime
p7m application/x-pkcs7-mime
p7r application/x-pkcs7-certreqresp
p7s application/x-pkcs7-signature
pbm image/x-portable-bitmap
pdf application/pdf
pfx application/x-pkcs12
pgm image/x-portable-graymap
pko application/vnd.ms-pkipko
pma application/x-perfmon
pmc application/x-perfmon
pml application/x-perfmon
pmr application/x-perfmon
pmw application/x-perfmon
pnm image/x-portable-anymap
pot application/vnd.ms-powerpoint
ppm image/x-portable-pixmap
pps application/vnd.ms-powerpoint
ppt application/vnd.ms-powerpoint
prf application/pics-rules
ps application/postscript
pub application/x-mspublisher
qt video/quicktime
ra audio/x-pn-realaudio
ram audio/x-pn-realaudio
ras image/x-cmu-raster
rgb image/x-rgb
rmi audio/mid
roff application/x-troff
rtf application/rtf
rtx text/richtext
scd application/x-msschedule
sct text/scriptlet
setpay application/set-payment-initiation
setreg application/set-registration-initiation
sh application/x-sh
shar application/x-shar
sit application/x-stuffit
snd audio/basic
spc application/x-pkcs7-certificates
spl application/futuresplash
src application/x-wais-source
sst application/vnd.ms-pkicertstore
stl application/vnd.ms-pkistl
stm text/html
svg image/svg+xml
sv4cpio application/x-sv4cpio
sv4crc application/x-sv4crc
swf application/x-shockwave-flash
t application/x-troff
tar application/x-tar
tcl application/x-tcl
tex application/x-tex
texi application/x-texinfo
texinfo application/x-texinfo
tgz application/x-compressed
tif image/tiff
tiff image/tiff
tr application/x-troff
trm application/x-msterminal
tsv text/tab-separated-values
txt text/plain
uls text/iuls
ustar application/x-ustar
vcf text/x-vcard
vrml x-world/x-vrml
wav audio/x-wav
wcm application/vnd.ms-works
wdb application/vnd.ms-works
wks application/vnd.ms-works
wmf application/x-msmetafile
wps application/vnd.ms-works
wri application/x-mswrite
wrl x-world/x-vrml
wrz x-world/x-vrml
xaf x-world/x-vrml
xbm image/x-xbitmap
xla application/vnd.ms-excel
xlc application/vnd.ms-excel
xlm application/vnd.ms-excel
xls application/vnd.ms-excel
xlt application/vnd.ms-excel
xlw application/vnd.ms-excel
xof x-world/x-vrml
xpm image/x-xpixmap
xwd image/x-xwindowdump
z application/x-compress
zip application/zip
csv text/csv

III. GBK or UTF-8?

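The problems above actually had one more cause. The default encoding on a Chinese-language system is GBK, so the file stream was generated as GBK, but the export side used UTF-8, and the text came out garbled. The same encoding must be used throughout.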

ByteArrayOutputStream baos = new ByteArrayOutputStream();
// The file content is written as GBK here
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(baos, "GBK"));
...
byte[] source = baos.toByteArray();
baos.close();
return source;

byte[] source = (byte[]) result.get("exportData");
HttpServletResponse response = DataUtils.getResponse();
if (response != null) {
    response.reset();
    // UTF-8 is used here, which does not match the GBK used when the bytes were generated
    response.setHeader("Content-Disposition", "attachment;filename*=UTF-8''"
                        + URLEncoder.encode(result.get("subject") + ".csv", "UTF-8").replace("+", "%20"));
    response.setHeader("Connection", "close");
    response.setHeader("Content-Type", "text/csv");
    OutputStream out = response.getOutputStream();
    out.write(source);
    out.flush();
    out.close();
}
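
The effect of a mismatch is easy to reproduce without a servlet container. The following sketch (CharsetMismatchSketch and roundTrip are names invented for this example) encodes a string with one charset and decodes it with another. It assumes the running JDK ships the GBK charset, which standard JDK builds do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchSketch {

    // Assumption: GBK is available in this JDK (true for standard builds).
    static final Charset GBK = Charset.forName("GBK");

    /** Encodes text with one charset and decodes the bytes with another. */
    static String roundTrip(String text, Charset writeAs, Charset readAs) {
        byte[] bytes = text.getBytes(writeAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        String original = "\u4E2D\u6587"; // the two characters of "Chinese"

        // Same charset on both sides: the text survives.
        System.out.println(original.equals(roundTrip(original, GBK, GBK)));

        // Written as GBK but read as UTF-8: the text is garbled.
        System.out.println(original.equals(roundTrip(original, GBK, StandardCharsets.UTF_8)));
    }
}
```

The first comparison holds and the second does not, which is the garbling described above: the bytes are fine, but they are read with the wrong charset.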
           

1. Explanation:

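GBK is an extension of the national standard GB2312 and stays compatible with it (GBK itself does not appear to be a formal national standard). It is designed for Chinese text and is variable-width: ASCII characters, including English letters, take one byte, and Chinese characters take two bytes.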

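UTF-8 is a multi-byte encoding meant for international text. English letters take 8 bits (one byte) and most Chinese characters take 24 bits (three bytes). For English text the two encodings are therefore the same size, and the difference only shows up for Chinese. In addition, a visitor whose system lacks Chinese language support may need to install it before a GBK page displays correctly, while a UTF-8 page does not have this problem and can be viewed directly.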
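
The byte counts can be checked directly. This is a small sketch with a made-up class name, EncodedSizeSketch, and helper, sizeIn, again assuming the JDK includes the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodedSizeSketch {

    // Assumption: GBK is available in this JDK (true for standard builds).
    static final Charset GBK = Charset.forName("GBK");

    /** Number of bytes the text occupies in the given charset. */
    static int sizeIn(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String english = "abc";
        String chinese = "\u4E2D\u6587"; // two Chinese characters

        System.out.println("abc in GBK:       " + sizeIn(english, GBK));
        System.out.println("abc in UTF-8:     " + sizeIn(english, StandardCharsets.UTF_8));
        System.out.println("Chinese in GBK:   " + sizeIn(chinese, GBK));
        System.out.println("Chinese in UTF-8: " + sizeIn(chinese, StandardCharsets.UTF_8));
    }
}
```

Three ASCII letters occupy three bytes in both encodings, while the two Chinese characters occupy four bytes in GBK and six in UTF-8.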

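GBK covers the Chinese characters in common use (over 20,000 of them), while UTF-8 can encode every character in Unicode, which is meant to cover the writing systems of all countries.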

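People often ask whether UTF-8 or GBK is the better encoding for web pages. It depends on your needs. If you mainly build Chinese-language applications and your users are mainly in China, GBK is an option, because a Chinese character takes three bytes in UTF-8 but only two in GBK, so GBK saves space.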

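If you are building an English-language site, use UTF-8. English letters take one byte in both encodings, so GBK offers no size advantage there, and visitors abroad may need extra language support to view a GBK page.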

If your site is in Chinese but also has a fair number of users abroad, UTF-8 is still the better choice.

UTF-8 encoded text can be displayed in any browser, in any country, that supports the UTF-8 character set.

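For example, with UTF-8 a visitor using an English version of IE can see the Chinese text without downloading IE's Chinese language support pack. As for size, GBK's advantage applies only to Chinese characters (two bytes instead of three); on a forum with mostly English text, each letter takes one byte in either encoding.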

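UTF-8 is an international encoding and is more widely usable, so visitors abroad can browse the forum too. GBK is a national encoding and is less widely usable than UTF-8, although Chinese text stored as UTF-8 takes more database space than the same text in GBK.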
