Bat, PS1, VBS, JS Scripts Throwing Errors or Garbled Text? It’s All About Text Encoding!

Script Type	Runtime Environment	Recommended Encoding	Notes
`.js`	Node.js	UTF-8 (No BOM)	Modern web development standard; avoid a BOM.
`.js`	WScript (double-click)	ANSI (GBK)	Windows Script Host has poor UTF-8 support; save as ANSI so that `WScript.Echo` shows Chinese characters correctly.
`.bat`	CMD Command Line	ANSI (GBK)	By default, CMD on Chinese Windows only understands ANSI. Saving as UTF-8 will cause errors or garbled text.
`.ps1`	PowerShell 5.1	UTF-8 with BOM	Older PowerShell (the blue-background version bundled with Windows) needs BOM to recognize UTF-8.
`.ps1`	PowerShell 7+	UTF-8 (No BOM)	Microsoft’s newer cross-platform PS (black background) defaults to and recommends standard UTF-8.
`.vbs`	WScript / CScript	ANSI (GBK) or UTF-16 LE	VBS is a very old scripting language; ANSI is the safest bet. In Windows Notepad, choose “ANSI” or “UTF-16 LE” (older Notepad versions label it “Unicode”). Never use UTF-8.
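
The table can be turned into a small lookup, sketched here in Python. The `RECOMMENDED` dictionary, the runtime labels, and the `encode_script` helper are hypothetical names made up for this illustration, and `gbk` stands in for “ANSI” only on Simplified Chinese Windows.

```python
# Hypothetical lookup mirroring the table above.
# Assumption: "ANSI" means GBK, which is only true on Simplified Chinese Windows.
RECOMMENDED = {
    (".js", "node"): "utf-8",               # UTF-8 without BOM
    (".js", "wscript"): "gbk",              # ANSI
    (".bat", "cmd"): "gbk",                 # ANSI
    (".ps1", "powershell5"): "utf-8-sig",   # UTF-8 with BOM
    (".ps1", "powershell7"): "utf-8",       # UTF-8 without BOM
    (".vbs", "wscript"): "gbk",             # ANSI
}


def encode_script(text: str, ext: str, runtime: str) -> bytes:
    """Return the bytes that should be written to disk for this script type."""
    return text.encode(RECOMMENDED[(ext, runtime)])


ps1_bytes = encode_script('Write-Host "你好"', ".ps1", "powershell5")
bat_bytes = encode_script("echo 你好", ".bat", "cmd")

print(ps1_bytes[:3].hex())  # the three BOM bytes added by "utf-8-sig"
print(bat_bytes.hex())      # GBK bytes, no BOM
```

Writing the result with `open(path, "wb")` keeps Python from re-encoding the text a second time.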

In Windows, ANSI is not a specific character encoding—it’s a chameleon.

Simply put, when you choose “ANSI” in Windows Notepad, it actually means: “Use the current system’s default locale encoding.”

What Does ANSI Actually Equal? It Depends on Your Country!

Because ANSI follows the system language, a file labeled “ANSI” will use completely different underlying encodings on computers in different countries (Windows calls these “Code Pages”):

On Simplified Chinese Windows: ANSI = GBK (Code Page 936)
On Traditional Chinese Windows: ANSI = Big5 (Code Page 950)
On Japanese Windows: ANSI = Shift-JIS (Code Page 932)
On US/English Windows: ANSI = Windows-1252 (Code Page 1252)
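
A short Python sketch of what this list means in practice: the same two characters become different bytes under different code pages, and under Windows-1252 they cannot be stored at all. The codec names `cp936`, `cp950`, and `cp1252` are Python's names for these code pages; the variable names are only for illustration.

```python
text = "你好"

gbk_bytes = text.encode("cp936")    # what "ANSI" produces on Simplified Chinese Windows
big5_bytes = text.encode("cp950")   # what "ANSI" produces on Traditional Chinese Windows

# Windows-1252 has no Chinese characters, so encoding fails outright.
try:
    text.encode("cp1252")
    fits_cp1252 = True
except UnicodeEncodeError:
    fits_cp1252 = False

print(gbk_bytes.hex())
print(big5_bytes.hex())
print(fits_cp1252)
```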

Why Does English Never Get Garbled, But Chinese Does?

This is the root cause of the “garbled text” phenomenon:

Why English doesn’t get garbled: GBK, Big5, and Windows-1252 all extend the basic ASCII character set. The first 128 characters (English letters, digits, punctuation) are stored as the same bytes in every one of them, so plain English text reads correctly under any of these encodings.
When Chinese gets garbled (cross-country/cross-system): Suppose you create a text file on Chinese Windows, write “你好”, and save it as ANSI (the system silently stores it as the GBK bytes C4 E3 BA C3). Then you send this file to a friend using English Windows. The file carries no label saying which encoding it uses, so when they open it, their system assumes “ANSI” means its own code page, Windows-1252, and decodes the bytes that way. The result? The Chinese characters turn into unrelated European letters (ÄãºÃ).
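
Both cases can be reproduced in a few lines of Python. This is a sketch of the mechanism, not of what Windows does internally. The sample strings and variable names are chosen here for illustration.

```python
# Case 1: ASCII text encodes to identical bytes under every "ANSI" code page.
ascii_text = "echo Hello"
code_pages = ["cp936", "cp950", "cp1252"]  # GBK, Big5, Windows-1252
ascii_encodings = {name: ascii_text.encode(name) for name in code_pages}
all_same = len(set(ascii_encodings.values())) == 1

# Case 2: Chinese text saved as GBK, then opened as Windows-1252.
stored = "你好".encode("gbk")          # bytes written on Chinese Windows
garbled = stored.decode("cp1252")      # what English Windows shows

# The bytes themselves are undamaged, so the mistake can be reversed.
recovered = garbled.encode("cp1252").decode("gbk")

print(all_same)
print(stored.hex())
```

The last step shows that garbled text of this kind is a display problem: the file still holds the right bytes, and decoding them with the right code page brings the text back.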

Why Did Microsoft Call It ANSI?

ANSI stands for the American National Standards Institute.

In the early days of Windows (Windows 3.1 and Windows 95 era), Microsoft introduced different code pages to support multiple languages. At that time, the default English code page was Windows-1252, which was loosely based on a draft standard from the ANSI organization. Microsoft took the easy route and uniformly named the “system default locale encoding” option as “ANSI” throughout the system.

What Is BOM?

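BOM stands for Byte Order Mark: a few marker bytes placed at the very start of a text file. The Unicode standard defines it, but it was Microsoft’s tools that made a habit of attaching it to UTF-8 files.

In modern web development and programming (including your Hexo blog, Node.js, JS, frontend code), a BOM on a UTF-8 file brings no benefit and can cause real harm.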

What Was BOM Originally For?

At the computer’s lowest level, data is stored in “bytes.” UTF-16 and UTF-32 store text in 2-byte and 4-byte units. This raises a question: within each unit, should the high-order byte or the low-order byte come first in the file?

Big-Endian: The high-order byte comes first (like a person standing upright).
Little-Endian: The low-order byte comes first (like a person doing a handstand).

To tell the computer whether the file is “standing” or “handstanding,” the Unicode standard specifies placing a special invisible character (code U+FEFF) at the very beginning of the file. When the computer reads this character, it knows how to parse the subsequent bytes. This invisible character is the BOM.
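
A small Python sketch of the idea, using the letter “A” (U+0041) as sample text. The constants `codecs.BOM_UTF16_LE` and `codecs.BOM_UTF16_BE` come from Python's standard library; the variable names are only for illustration.

```python
import codecs

# The same character, stored in the two possible byte orders.
little = "A".encode("utf-16-le")   # low-order byte first:  41 00
big = "A".encode("utf-16-be")      # high-order byte first: 00 41

# U+FEFF written in each byte order gives the two UTF-16 BOMs.
with_le_bom = codecs.BOM_UTF16_LE + little   # FF FE 41 00
with_be_bom = codecs.BOM_UTF16_BE + big      # FE FF 00 41

# The generic "utf-16" decoder reads the BOM, picks the byte order,
# and drops the BOM from the decoded text.
decoded_le = with_le_bom.decode("utf-16")
decoded_be = with_be_bom.decode("utf-16")

print(with_le_bom.hex(), with_be_bom.hex())
```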

Why Is BOM a “Cancer” in UTF-8?

Please note: UTF-8 does NOT need BOM!

UTF-8’s design is clever: it is defined as a sequence of single bytes (1 to 4 per character) read in a fixed order, so there’s no “big-endian” or “little-endian” issue. The Unicode standard permits a BOM in UTF-8 but neither requires nor recommends it.

So why does “UTF-8 with BOM” exist? Blame Microsoft again.

In the early days, Windows’ default encoding was ANSI. When Windows’ built-in “Notepad” opened a file, it couldn’t tell whether it was ANSI or UTF-8. To take a shortcut, Microsoft decided:

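Any UTF-8 file saved with Notepad would have a 3-byte BOM (hexadecimal EF BB BF) inserted at the beginning. When the file was opened again and Notepad saw these 3 bytes, it knew the file was UTF-8. Recent versions of Notepad list “UTF-8” and “UTF-8 with BOM” as separate options.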
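
The detection trick can be sketched in Python as follows. This is a simplified illustration, not Notepad's actual logic: the `sniff_encoding` function and its `fallback` parameter are hypothetical, and the fallback of `cp936` assumes a Simplified Chinese system. UTF-32 BOMs are ignored to keep the sketch short.

```python
import codecs


def sniff_encoding(data: bytes, fallback: str = "cp936") -> str:
    """Guess a Python codec name from the first bytes of a file."""
    if data.startswith(codecs.BOM_UTF8):          # EF BB BF
        return "utf-8-sig"                        # decodes UTF-8 and drops the BOM
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"                           # decoder picks the order from the BOM
    return fallback                               # no BOM: assume the "ANSI" code page


with_bom = "你好".encode("utf-8-sig")   # what "UTF-8 with BOM" writes
ansi = "你好".encode("gbk")             # what "ANSI" writes on Chinese Windows

print(with_bom[:3].hex())
print(sniff_encoding(with_bom), sniff_encoding(ansi))
```

A UTF-8 file without a BOM falls through to the fallback here, which is exactly the ambiguity the BOM was meant to remove.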

How Does BOM Crash Your Code?

These 3 bytes (EF BB BF) encode the invisible character U+FEFF (officially named “zero width no-break space”). You can’t see it in most editors, but programs reading the file can get tripped up.

Many tools on Linux and macOS, and much code that reads files by hand, assume plain UTF-8 with no BOM. They don’t treat the BOM as a marker and instead see it as the first character of the content.

This leads to many bizarre bugs. Node.js’s module loader strips a leading BOM when it loads a .js file, but the bytes survive elsewhere: read a BOM-prefixed JSON file with fs.readFileSync and pass the text to JSON.parse, and it throws a SyntaxError on the invisible first character. A shell script can fail in a similar way, because the system no longer finds #! at the very start of the file.
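
The same class of failure is easy to reproduce in Python, shown here with a JSON parser rather than with Node.js. The sample bytes and variable names are made up for this sketch.

```python
import json

# A small JSON file as saved by an editor that adds a UTF-8 BOM.
raw = b'\xef\xbb\xbf{"name": "demo"}'

# Decoding as plain UTF-8 keeps U+FEFF as the first character,
# and the parser rejects it.
try:
    json.loads(raw.decode("utf-8"))
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False

# Decoding with "utf-8-sig" drops a leading BOM if one is present.
config = json.loads(raw.decode("utf-8-sig"))

print(naive_ok)
print(config["name"])
```

Using `utf-8-sig` for reading is a forgiving choice: it accepts UTF-8 files both with and without a BOM.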

https://en.lvlele.top/092-bat-ps1-vbs-js-encoding-error/

Author

Lvlele 吕了了

Posted on

June 27, 2026

Licensed under