VBScript Scripting Techniques > Files > File Encoding
ADODB.Stream | |
---|---|
VBScript Code: | |
' Based on information from ' https://en.wikipedia.org/wiki/Byte_order_mark ' 'Encoding Hex BOM '======== ======= 'BOCU-1 FB EE 28 'GB-18030 84 31 95 33 'SCSU 0E FE FF 'UTF-1 F7 64 4C 'UTF-7 2B 2F 76 (38|39|2B|2F) 'UTF-8 EF BB BF 'UTF-16 (BE) FE FF 'UTF-16 (LE) FF FE 'UTF-32 (BE) 00 00 FE FF 'UTF-32 (LE) FF FE 00 00 'UTF-EBCDIC DD 73 66 73 Option Explicit Const adTypeBinary = 1 Const adTypeText = 2 Dim i, intRC Dim dicBOMs, objFSO, objStream Dim strBOM, strFile, strHead, strType, strUTF7 If WScript.Arguments.Unnamed.Count <> 1 Then Syntax If WScript.Arguments.Named.Count > 0 Then Syntax intRC = 0 strFile = WScript.Arguments.Unnamed(0) strType = "Unknown" strUTF7 = "38;39;2B;2F" ' Allowed values for 4th byte of UTF-7 BOM Set objFSO = CreateObject( "Scripting.FileSystemObject" ) If Not objFSO.FileExists( strFile ) Then Syntax Set objFSO = Nothing Set dicBOMs = CreateObject( "Scripting.Dictionary" ) dicBOMs.Add "0000FEFF", "UTF-32 (BE)" dicBOMs.Add "0EFEFF", "SCSU" dicBOMs.Add "2B2F76", "UTF-7" ' First 3 bytes of BOM only, 4th byte can have several values dicBOMs.Add "84319533", "GB-18030" dicBOMs.Add "DD736673", "UTF-EBCDIC" dicBOMs.Add "EFBBBF", "UTF-8" dicBOMs.Add "F7644C", "UTF-1" dicBOMs.Add "FBEE28", "BOCU-1" dicBOMs.Add "FEFF", "UTF-16 (BE)" dicBOMs.Add "FFFE", "UTF-16 (LE)" dicBOMs.Add "FFFE0000", "UTF-32 (LE)" On Error Resume Next Set objStream = CreateObject( "ADODB.Stream" ) objStream.Open objStream.Type = adTypeBinary objStream.LoadFromFile strFile If Err Then intRC = 1 objStream.Position = 0 strHead = "" For i = 0 To 3 strHead = strHead & UCase( Right( "0" & Hex( AscB( objStream.Read( 1 ) ) ), 2 ) ) If Err Then intRC = 1 Next objStream.Close Set objStream = Nothing On Error Goto 0 If intRC = 1 Then Syntax For i = 8 To 4 Step -2 ' Try the longest match (4 bytes) first, next try 3 bytes, finally try 2 bytes If strType = "Unknown" Then strBOM = Left( strHead, i ) If dicBOMs.Exists( strBOM ) Then If dicBOMs( strBOM ) = "UTF-7" Then If InStr( strUTF7, Right( strHead, 2 ) ) Then strType = "UTF-7" Else strType = dicBOMs( strBOM ) End If End If End If Next If strType = "Unknown" Then intRC = 1 WScript.Echo "File Name : " & strFile & vbcrlf _ & "First 4 bytes : " & strHead & vbcrlf _ & "Matching BOM : " & strBOM & vbcrlf _ & "File Encoding : " & strType WScript.Quit intRC Sub Syntax Dim strMsg strMsg = vbCrLf _ & "CheckBOM.vbs, Version 1.00" _ & vbCrLf _ & "Check a file's Byte Order Mark (BOM) to determine its text encoding" _ & vbCrLf & vbCrLf _ & "Usage: CheckBOM.vbs textfilename" _ & vbCrLf & vbCrLf _ & "Note: The file encoding is displayed on screen, e.g. ""UTF-7"" or" _ & vbCrLf _ & " ""UTF-32 (LE)"", or ""Unknown"" if not recognized." _ & vbCrLf _ & " Check this script's source code for a list of recognized BOMs." _ & vbCrLf & vbCrLf _ & "Written by Rob van der Woude" _ & vbCrLf _ & "http://www.robvanderwoude.com" WScript.Echo strMsg WScript.Quit 1 End Sub |
|
Requirements: | |
Windows version: | 2000 and later |
Network: | any |
Client software: | MDAC 2.8 for Windows 2000 |
Script Engine: | any |
Summarized: | Works in Windows 2000 or later, Windows 2000 requires MDAC 2.8. |
[Back to the top of this page] |
page last modified: 2017-07-26; loaded in 0.0019 seconds