When can Minecraft parse a string without quotes?

๐ŸŽ™๏ธ tryashtar ยท 2 points ยท Posted at 04:51:26 on February 20, 2016 ยท (Permalink)


As we all probably know, many string NBT tags don't need to be wrapped in quotes -- doing something like CustomName:Blah works fine.

However, sometimes you need quotes. If I want a creature to be named "Steve", I would need to do CustomName:"\"Steve\"". However, if I wanted the creature to be named An "escaped" string, I don't have to wrap the whole thing in quotes: CustomName:An "escaped" string works just fine. Conversely, I can't name the monster Rand"om quot"ation"s using the same principle.

Quotes aren't the only thing that throws the system off -- commas do as well. CustomName:a,b will not parse at all. I would need to use CustomName:"a,b".

My question is: what are the rules? Basically, given any string, how can I determine whether I need to wrap it in quotes for Minecraft to parse it correctly?

Thank you!

Skylinerw · 5 points · Posted at 07:11:44 on February 20, 2016 · (Permalink)*

For the sake of simplicity, there are two steps, in order: syntax parsing and data parsing.

The parser first ensures that the data is syntactically correct, which is why Rand"om quot"ation"s won't work: unbalanced quotation marks. It doesn't know whether or not it's a string at that point, but it does know that you have 3 quotes when it's expecting an even number. Escaped quotes do not count towards that number, but escaped quotes can only be used if a normal quotation is currently open. The actual data is parsed after this step.

For data parsing, a string is only used when no other datatype matched or when numerical datatypes overflow (which includes an IntArray).

As such, a quotation mark does not necessarily mean it's a string, but since they're not used for any other datatype, it ensures it will be a string. However, quotation marks are best practice because they have a specific feature, as you've observed: they allow the usage of commas, brackets, and quotes as the value without causing unbalancing errors, provided the characters are between quotation marks (checked during syntax parsing). Example where usage of unbalanced brackets is skirted (though the quotes will be used as the literal value):

CustomName:he"{"llo

The following would create strings because in the first step they are syntactically correct, and in the second step they are not parsed as other datatypes (excluding overflow for now, will cover later):

CustomName:hello
CustomName:he"ll"o
CustomName:"hello"
CustomName:h"ello"
CustomName:te","st

CustomName:"hell"o does not work because the syntax parser expects a new tag to be opened or the remaining tags to be closed after a 'closing' quotation mark (specifically when a quotation mark is used at the beginning of the tag value). This is happening during the data parsing phase.

Note that as far as tag separation goes, the first colon separates the tag name from the entirety of the tag value, so any subsequent colons will be part of the value (e.g. CustomName:::::::: is fine). It's a common misconception that you need quotes around colons; the cases where you absolutely need quotes are the following (syntax parsing):

CustomName:"\"Quotes as the first and last value.\""
CustomName:"comma, as the value."
CustomName:"Otherwise-unbalanced \" quotation marks."
CustomName:"Otherwise-unbalanced { [ curly/square brackets."
CustomName:"Otherwise-unbalanced } ] curly/square brackets."

(data parsing):

CustomName:"{ensuring no other datatype is declared.}"
CustomName:"1"
CustomName:"1b, and so on for all datatypes"

When a string is parsed through non-overflow means, the quotation marks around it will be stripped but only if there exists quotes as the first and last character. This is why CustomName:"hello" does not have quotes in the actual value and why CustomName:he"ll"o does. As well, CustomName:he"llo" will also use quotes as the literal value because there is no quotation mark at the start of the value.
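
The stripping rule above can be sketched in a few lines of Python. This is only an illustration of the behaviour described in this comment, not the game's code, and the function name strip_outer_quotes is made up for the example.

```python
def strip_outer_quotes(value):
    """Remove one pair of quotes, but only if they are the first and last characters."""
    if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
        return value[1:-1]
    # Quotes anywhere else are left alone and stay part of the literal value.
    return value


print(strip_outer_quotes('"hello"'))   # hello
print(strip_outer_quotes('he"ll"o'))   # he"ll"o
print(strip_outer_quotes('he"llo"'))   # he"llo"
```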

As far as string declaration from overflow goes, it's pretty straight-forward: you'd declare a specific datatype first, but provide it with a value that is out of range. For example, a byte tag has a range from -128 to 127. Giving it a value of 300 is too high, and so the tag becomes a string of that literal input instead:

CustomName:300b

Essentially means:

CustomName:"300b"

For an IntArray, if any of the records holds an integer that overflows, the entire input becomes a string:

CustomName:[1,2,2147483648]

Essentially means:

CustomName:"[1,2,2147483648]"
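
As a rough illustration of the overflow fallback, here is a Python sketch. It assumes simplified patterns for only three types (byte, int, IntArray); the names classify, BYTE, INT and INT_ARRAY are hypothetical, and the game's real expressions and type order may differ.

```python
import re

# Simplified, hypothetical patterns; the game's real expressions may differ.
BYTE = re.compile(r"[-+]?[0-9]+[bB]")
INT = re.compile(r"[-+]?[0-9]+")
INT_ARRAY = re.compile(r"\[[-+0-9,\s]+\]")

INT_MIN, INT_MAX = -2**31, 2**31 - 1


def classify(value):
    """Guess the tag type of a raw value, falling back to a string on overflow."""
    if BYTE.fullmatch(value):
        number = int(value[:-1])
        if -128 <= number <= 127:
            return ("byte", number)
        return ("string", value)  # out of range: keep the literal input
    if INT.fullmatch(value):
        number = int(value)
        if INT_MIN <= number <= INT_MAX:
            return ("int", number)
        return ("string", value)
    if INT_ARRAY.fullmatch(value):
        try:
            items = [int(item) for item in value[1:-1].split(",")]
        except ValueError:
            return ("string", value)  # an entry is not a parseable integer
        if all(INT_MIN <= item <= INT_MAX for item in items):
            return ("int_array", items)
        return ("string", value)  # one overflowing entry makes the whole input a string
    return ("string", value)  # nothing else matched


print(classify("300b"))              # ('string', '300b')
print(classify("[1,2,2147483648]"))  # ('string', '[1,2,2147483648]')
```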

pau101 · 2 points · Posted at 07:59:05 on February 20, 2016 · (Permalink)

To go along with specifying data types: the "|" character following a number will result in the value being parsed as a double, which would normally be done with a "d", a "D", or a decimal point without a data type suffix. This is the result of a technical error that is present in each of the typed tag regular expressions.

Skylinerw · 2 points · Posted at 08:17:17 on February 20, 2016 · (Permalink)

That's correct!

The parser uses the following regex:

[-+]?[0-9]*\\.?[0-9]+[d|D]

| does not act as OR inside a character class, so it is instead matched as a literal character. The IntArray type also uses it in that manner (pattern \\[[-+\\d|,\\s]+\\]), so the following tries to create an IntArray instead of a List (finally resulting in a String due to a non-parseable Integer):

[1,2,|,4]
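
The character-class behaviour is easy to reproduce outside the game. The Python sketch below uses the two patterns quoted above with the Java string escaping removed (\\. becomes \.); DOUBLE_INTENDED is my guess at what the pattern was meant to be, and the helper name matches is made up.

```python
import re

# Patterns as quoted in the thread, minus the Java string-literal escaping.
DOUBLE_AS_WRITTEN = re.compile(r"[-+]?[0-9]*\.?[0-9]+[d|D]")
INT_ARRAY_AS_WRITTEN = re.compile(r"\[[-+\d|,\s]+\]")

# Assumed intent: only "d" or "D" as the suffix.
DOUBLE_INTENDED = re.compile(r"[-+]?[0-9]*\.?[0-9]+[dD]")


def matches(pattern, text):
    """True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None


print(matches(DOUBLE_AS_WRITTEN, "1|"))            # True: "|" is a literal in the class
print(matches(DOUBLE_INTENDED, "1|"))              # False
print(matches(INT_ARRAY_AS_WRITTEN, "[1,2,|,4]"))  # True
```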

I'll create a bug report for that (even though it's fairly minor).

๐ŸŽ™๏ธ tryashtar ยท 1 points ยท Posted at 23:22:33 on February 20, 2016 ยท (Permalink)

Wow! Thank you so much for the in-depth answer! This is exactly what I was looking for!

MrPingouin1 · 3 points · Posted at 07:18:33 on February 20, 2016 · (Permalink)*

There are a lot of different rules, and you kinda have to understand how the NBT parser is working.

 

1) Matching data type :

The value of the CustomName tag is supposed to be a String, but when Minecraft parses a command it doesn't know that yet, so it has to guess. String is the default type: if the value does not fit any other type, it becomes a String tag. The problem is that if the value can be parsed as an Integer (or another type), the game will create a CustomName tag that is NOT of type String, and when the game actually needs the tag, it will ask "Can I get a String tag called CustomName?" and the answer will be NO. None of the following commands will set a CustomName of type String on the entity:

/summon Sheep ~ ~ ~ {CustomName:1}
/summon Sheep ~ ~ ~ {CustomName:1b}
/summon Sheep ~ ~ ~ {CustomName:1.0}
/summon Sheep ~ ~ ~ {CustomName:111111111111111L}
/summon Sheep ~ ~ ~ {CustomName:{CustomName:Sheep,Tags:["red"]}}
/summon Sheep ~ ~ ~ {CustomName:[{},{},{empty:true}]}
/summon Sheep ~ ~ ~ {CustomName:[1,2,3,4]}

The parser has a special regex to avoid confusing a List with an Int array. It will only try to parse a list or compound if the value is unquoted (only the first character is checked) and begins with [ or {. If parsing one of those fails, it throws an error instead of creating a String tag.

/summon Sheep ~ ~ ~ {CustomName:111111111111111}

This command will work because 111111111111111 looks like a number but can't be parsed as an Integer, so it becomes a String.

 

2) Correct syntax { [ " nesting :

When the parser reaches the CustomName tag (it still doesn't know the type of this tag), it tries to read the value. Since it knows it is inside a compound tag, it reads the following characters one by one until it reaches either a comma or a closing curly bracket. By that logic, the parser should not work with things like {CustomName:abc[,]}, but it does, because the parser is smarter than that. Every time it encounters an opening bracket ("{" or "["), it stops looking for the comma or the closing curly bracket and instead waits for the corresponding closing bracket. It can do this at deeper nesting levels too. If it encounters an unexpected closing bracket, it throws an error. During this step, everything other than { [ ] } " \ , is ignored. Here are some working and then not working examples:

/summon Sheep ~ ~ ~ {CustomName:abc{}}
/summon Sheep ~ ~ ~ {CustomName:a{}[]}
/summon Sheep ~ ~ ~ {CustomName:a{[]}}
/summon Sheep ~ ~ ~ {CustomName:a{[,]}}
/summon Sheep ~ ~ ~ {CustomName:a{{}[]{}{[]}}}
Not working :
/summon Sheep ~ ~ ~ {CustomName:a{}
/summon Sheep ~ ~ ~ {CustomName:a]}
/summon Sheep ~ ~ ~ {CustomName:a[{]}}
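
The bracket tracking from this section can be sketched with a stack. This is a Python illustration of the rule as described here, not the game's code: the name split_top_level is hypothetical, the input is assumed to be the inside of the outer compound (outer { } already removed), and quotes and escapes are deliberately ignored.

```python
def split_top_level(body):
    """Split the inside of a compound on commas that are not nested in brackets."""
    closers = {"}": "{", "]": "["}
    stack = []   # currently open brackets
    parts = []
    start = 0
    for i, ch in enumerate(body):
        if ch in "{[":
            stack.append(ch)
        elif ch in closers:
            # A closing bracket must match the most recently opened one.
            if not stack or stack.pop() != closers[ch]:
                raise ValueError(f"unbalanced {ch!r} at index {i}")
        elif ch == "," and not stack:
            # Only a comma outside every bracket ends the current tag.
            parts.append(body[start:i])
            start = i + 1
    if stack:
        raise ValueError("unclosed bracket")
    parts.append(body[start:])
    return parts


print(split_top_level("CustomName:a{[,]},Invisible:1"))
# ['CustomName:a{[,]}', 'Invisible:1']
```

With this sketch, CustomName:a] and CustomName:a[{]} both raise an error, matching the "not working" examples.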

 

3) Quote Escape :

There are only two simple rules for this one: "The character after \ is escaped" and "You can't escape a quote unless you are inside a quote". When a character is a quote, the first thing to do is check whether it is escaped. That might sound really easy, but things like \\\\\\\\\\\\\\\\\\\" are allowed; it's not hard, but you do have to count all the \. If the quote is escaped and you are not inside a quote (the second rule above), an error is thrown; if you are inside a quote, it is kept as part of the quoted text. On the other hand, if the quote is not escaped, it either closes the current quote or opens a new one. Once a quote is open, you can forget all about { [ or , because the only thing that can close the quote is another unescaped quote. Based on that, here are some examples:

Working
/summon Sheep ~ ~ ~ {CustomName:""}
/summon Sheep ~ ~ ~ {CustomName:abc"def"ghi}
/summon Sheep ~ ~ ~ {CustomName:a""""""}
/summon Sheep ~ ~ ~ {CustomName:a\\}
/summon Sheep ~ ~ ~ {CustomName:a\a\a\a\}
Not working :
/summon Sheep ~ ~ ~ {CustomName:a"b}
/summon Sheep ~ ~ ~ {CustomName:a\"b}
/summon Sheep ~ ~ ~ {CustomName:a\\\""}
/summon Sheep ~ ~ ~ {CustomName:"ab"c}
/summon Sheep ~ ~ ~ {CustomName:a\]}

If the string begins with a quote, the parser expects a quote right at the end, and those quotes will be removed. The escape only works on \ and "; unlike in many programming languages, \a will not escape the a, and thus it will be interpreted as \a
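
The two quote rules can be sketched as follows. This is a Python illustration of the rules as described in this thread, not the game's code; the names is_escaped and check_quotes are made up. It does not cover brackets, and it does not cover the extra rule that a value starting with a quote must also end with one, so "ab"c passes this check even though the game rejects it.

```python
def is_escaped(text, index):
    """True if the character at index is preceded by an odd number of backslashes."""
    count = 0
    j = index - 1
    while j >= 0 and text[j] == "\\":
        count += 1
        j -= 1
    return count % 2 == 1


def check_quotes(value):
    """Raise ValueError if the quotes in value break either rule."""
    in_quote = False
    for i, ch in enumerate(value):
        if ch != '"':
            continue
        if is_escaped(value, i):
            # An escaped quote is only legal while a normal quote is open.
            if not in_quote:
                raise ValueError("escaped quote outside of a quoted section")
        else:
            # An unescaped quote opens a new quote or closes the current one.
            in_quote = not in_quote
    if in_quote:
        raise ValueError("unbalanced quotation marks")


check_quotes('abc"def"ghi')        # fine
check_quotes('a\\a\\a\\a\\')       # fine: backslashes with no quotes
# check_quotes('a\\"b') would raise: escaped quote outside of a quoted section
```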

๐ŸŽ™๏ธ tryashtar ยท 1 points ยท Posted at 23:24:48 on February 20, 2016 ยท (Permalink)

Thank you very much! This is great information, I appreciate your detailed answer.

brianmcn · 2 points · Posted at 04:55:43 on February 20, 2016 · (Permalink)

I think you have most of the rules: does it contain a quotation mark or a comma? I bet that a right curly brace, perhaps a left one, and possibly square brackets may be the only other characters? Not sure about apostrophe. I don't know the actual rules, but I bet you can figure them out with minimal experimentation.

๐ŸŽ™๏ธ tryashtar ยท 1 points ยท Posted at 05:01:51 on February 20, 2016 ยท (Permalink)

Simply containing those characters doesn't mean it needs to be wrapped, though.

I tested it out, and you were right about brackets, but again only sometimes. For example, it would parse Command:abc[]abc just fine, but not Command:abc{].

brianmcn · 2 points · Posted at 05:39:01 on February 20, 2016 · (Permalink)

(But if you're just trying to use the fewest characters for some one-command tool or whatnot, I think a slightly conservative approach that occasionally adds needless quotes may be 'good enough' for most applications.)
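
One way such a conservative approach could look, as a Python sketch: leave a value bare only when it is made of plain letters, and otherwise wrap it in quotes and escape \ and ". The helper name to_nbt_string is made up, and the "letters only is safe" rule is my assumption rather than something confirmed in this thread.

```python
import re

# Assumption: a value made only of ASCII letters never matches another datatype.
PLAIN_LETTERS = re.compile(r"[A-Za-z]+")


def to_nbt_string(text):
    """Return text as an NBT string value, quoting whenever there is any doubt."""
    if PLAIN_LETTERS.fullmatch(text):
        return text
    # Escape backslashes first so the backslashes added for quotes are not doubled.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


print(to_nbt_string("Blah"))      # Blah
print(to_nbt_string('"Steve"'))   # "\"Steve\""
print(to_nbt_string("a,b"))       # "a,b"
```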

brianmcn · 1 points · Posted at 05:09:19 on February 20, 2016 · (Permalink)

You're right, it's not so simple; "CustomName:abc{}" parses the name "abc{}", for example. In that case, I expect it's a nightmare to suss out the details, and I'll pretend I never saw this question :)

brianmcn · 3 points · Posted at 15:31:31 on February 20, 2016 · (Permalink)

(The rest of the new comments on this thread demonstrate that I'm right, it's a nightmare.)

๐ŸŽ™๏ธ tryashtar ยท 1 points ยท Posted at 05:38:20 on February 20, 2016 ยท (Permalink)

Yep, I guess it's time to delve into undocumented behavior! ;)

Hopefully I can work it out.

pau101 · 1 points · Posted at 06:32:04 on February 20, 2016 · (Permalink)*

For normal json which isn't parsed as NBT, such as tellraw:

A string must be wrapped in quotes if it contains any of the following 16 characters, (the escaped characters refer to their actual value which can't be input normally in-game):

/ \ ; #
= { } [
] : , (space)
\t \f \r \n
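
A quick way to apply that list is a membership check, sketched here in Python. The names LENIENT_JSON_SPECIALS and needs_json_quotes are made up, and the space character is included on the assumption that it is the sixteenth character in the list.

```python
# The characters listed above; the space is assumed to be the sixteenth one.
LENIENT_JSON_SPECIALS = set("/\\;#={}[]:,") | {" ", "\t", "\f", "\r", "\n"}


def needs_json_quotes(text):
    """True if text contains a character that ends an unquoted JSON string."""
    return any(ch in LENIENT_JSON_SPECIALS for ch in text)


print(needs_json_quotes("hello"))   # False
print(needs_json_quotes("a:b"))     # True
```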

If you wish to use double quotes in a string you can use single quotes to quote the string instead so you don't need to escape the double quotes, e.g. CustomName:'my "quotes" here'

Skylinerw · 2 points · Posted at 07:13:39 on February 20, 2016 · (Permalink)

Just have to be extra explicit that NBT and JSON are not the same thing and have extremely different parsing rules. CustomName as NBT data will not accept single quotes as string declaration and will instead be used as the literal value.

And as of 1.9, JSON data must be strict (thus single quotes cannot be used to instantiate a string anymore as double quotes are required).