
Implementing Your Own Json Parser in C# (LALR(1) + miniDFA)

  • 2025-03-20
    Fujian
  • Word count: 61,108

    Estimated reading time: about 200 minutes



Json is a widely used data format with a simple grammar. This article shows how to use bitParser (my own parser generator, implemented in C#, which produces LALR(1) syntax parsers and miniDFA lexical analyzers) to quickly build a simple and efficient Json parser.

The complete code can be viewed and downloaded at https://gitee.com/bitzhuwei/bitParser-demos/tree/master/bitzhuwei.JsonFormat.TestConsole.


The Grammar of the Json Format


A detailed specification of the Json format can be found at https://ecma-international.org/wp-content/uploads/ECMA-404_2nd_edition_december_2017.pdf. From it, we can derive the following grammar:


// Json grammar according to ECMA-404 2nd Edition / December 2017
Json = Object | Array ;
Object = '{' '}' | '{' Members '}' ;
Array = '[' ']' | '[' Elements ']' ;
Members = Members ',' Member | Member ;
Elements = Elements ',' Element | Element ;
Member = 'string' ':' Value ;
Element = Value ;
Value = 'null' | 'true' | 'false' | 'number' | 'string'
      | Object | Array ;
%%"([^"\\\u0000-\u001F]|\\["\\/bfnrt]|\\u[0-9A-Fa-f]{4})*"%% 'string'
%%[-]?(0|[1-9][0-9]*)([.][0-9]+)?([eE][+-]?[0-9]+)?%% 'number'


In fact, I first drafted this grammar with the help of AI and then tidied it up by hand.


This grammar says:

  1. A Json document is either an Object or an Array.

  2. An Object contains zero or more key-value pairs ("key" : value), enclosed in { }.

  3. An Array contains zero or more values, enclosed in [ ].

  4. A value has one of the following types: null, true, false, number, string, Object, Array.


Regarding the tokens:


null, true, and false mean just what they say, so their lexical rules can be omitted. Written out explicitly in the grammar, they would read:


%%null%% 'null'
%%true%% 'true'
%%false%% 'false'


{ } [ ] , : are likewise literal, so their lexical rules can also be omitted. Written out explicitly in the grammar, they would read:


%%\{%% '{'
%%}%% '}'
%%\[%% '['
%%]%% ']'
%%,%% ','
%%:%% ':'


The number token can be described by the following diagram:



The diagram illustrates that the regular expression for the number token consists of four parts in sequence:


[-]?  (0|[1-9][0-9]*)  ([.][0-9]+)?  ([eE][+-]?[0-9]+)?
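To sanity-check this pattern, one can try it directly against .NET's Regex class. The snippet below is my own illustration, not part of the generated parser; the class and method names are made up, and the pattern is anchored so it must match the whole input:

```csharp
using System;
using System.Text.RegularExpressions;

// Illustrative only: validates candidate strings against the 'number'
// token pattern from the grammar, anchored to the whole input.
public static class NumberTokenDemo
{
    public static readonly Regex Pattern = new Regex(
        @"^[-]?(0|[1-9][0-9]*)([.][0-9]+)?([eE][+-]?[0-9]+)?$");

    public static bool IsNumber(string s) => Pattern.IsMatch(s);

    public static void Main()
    {
        Console.WriteLine(IsNumber("-12.5e+3")); // True
        Console.WriteLine(IsNumber("012"));      // False: Json forbids leading zeros
        Console.WriteLine(IsNumber("1."));       // False: a fraction needs digits
    }
}
```

Note how the `(0|[1-9][0-9]*)` part rejects leading zeros, and how both the fraction and the exponent are optional groups.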

The string token can be described by the following diagram:



The diagram illustrates that the regular expression for the string token is a sequence of ordinary characters or escape sequences wrapped in double quotes ("):


" (  [^"\\\u0000-\u001F]  |  \\["\\/bfnrt]  |  \\u[0-9A-Fa-f]{4}  )*  "
/* meaning:
   any character except ", \ and control characters (\u0000-\u001F)
   \", \\, \/, \b, \f, \n, \r, \t
   \uNNNN
*/
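This pattern, too, can be sanity-checked with .NET's Regex class. Again, this is my own illustration with made-up names; in a C# verbatim string, `""` stands for one double quote, and the regex engine itself interprets the `\u0000`-style escapes:

```csharp
using System;
using System.Text.RegularExpressions;

// Illustrative only: validates candidate strings against the 'string'
// token pattern from the grammar, anchored to the whole input.
public static class StringTokenDemo
{
    public static readonly Regex Pattern = new Regex(
        @"^""([^""\\\u0000-\u001F]|\\[""\\/bfnrt]|\\u[0-9A-Fa-f]{4})*""$");

    public static bool IsJsonString(string s) => Pattern.IsMatch(s);

    public static void Main()
    {
        Console.WriteLine(IsJsonString("\"a\\nb\""));   // True: \n escape
        Console.WriteLine(IsJsonString("\"\\u0041\"")); // True: \uNNNN escape
        Console.WriteLine(IsJsonString("\"broken"));    // False: unterminated
    }
}
```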


The Object and Array alternatives in the Value rule (Value = ... | Object | Array ;) mean that Json data can be nested.


Feed this grammar to bitParser and it generates, in one step, the Json parser code and documentation described in the sections below.


The Generated Lexical Analyzer



DFA



The DFA folder contains all the lexical states of the lexical analyzer, generated according to the theory of deterministic finite automata.


The initial state lexicalState0


using System;
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {
        private static readonly Action<LexicalContext, char, CurrentStateWrap> lexicalState0 =
            static (context, c, wrap) => {
                if (false) { /* for simpler code generation purpose. */ }
                /* [1-9] */
                else if (/* possible Vt : 'number' */
                    '1'/*'\u0031'(49)*/ <= c && c <= '9'/*'\u0039'(57)*/) {
                    BeginToken(context);
                    ExtendToken(context, st.@number);
                    wrap.currentState = lexicalState1;
                }
                /* 0 */
                else if (/* possible Vt : 'number' */ c == '0'/*'\u0030'(48)*/) {
                    BeginToken(context);
                    ExtendToken(context, st.@number);
                    wrap.currentState = lexicalState2;
                }
                /* [-] */
                else if (/* possible Vt : 'number' */ c == '-'/*'\u002D'(45)*/) {
                    BeginToken(context);
                    wrap.currentState = lexicalState3;
                }
                /* " */
                else if (/* possible Vt : 'string' */ c == '"'/*'\u0022'(34)*/) {
                    BeginToken(context);
                    wrap.currentState = lexicalState4;
                }
                /* f */
                else if (/* possible Vt : 'false' */ c == 'f'/*'\u0066'(102)*/) {
                    BeginToken(context);
                    wrap.currentState = lexicalState5;
                }
                /* t */
                else if (/* possible Vt : 'true' */ c == 't'/*'\u0074'(116)*/) {
                    BeginToken(context);
                    wrap.currentState = lexicalState6;
                }
                /* n */
                else if (/* possible Vt : 'null' */ c == 'n'/*'\u006E'(110)*/) {
                    BeginToken(context);
                    wrap.currentState = lexicalState7;
                }
                /* : */
                else if (/* possible Vt : ':' */ c == ':'/*'\u003A'(58)*/) {
                    BeginToken(context);
                    ExtendToken(context, st.@Colon符);
                    wrap.currentState = lexicalState8;
                }
                /* , */
                else if (/* possible Vt : ',' */ c == ','/*'\u002C'(44)*/) {
                    BeginToken(context);
                    ExtendToken(context, st.@Comma符);
                    wrap.currentState = lexicalState9;
                }
                /* ] */
                else if (/* possible Vt : ']' */ c == ']'/*'\u005D'(93)*/) {
                    BeginToken(context);
                    ExtendToken(context, st.@RightBracket符);
                    wrap.currentState = lexicalState10;
                }
                /* \[ */
                else if (/* possible Vt : '[' */ c == '['/*'\u005B'(91)*/) {
                    BeginToken(context);
                    ExtendToken(context, st.@LeftBracket符);
                    wrap.currentState = lexicalState11;
                }
                /* } */
                else if (/* possible Vt : '}' */ c == '}'/*'\u007D'(125)*/) {
                    BeginToken(context);
                    ExtendToken(context, st.@RightBrace符);
                    wrap.currentState = lexicalState12;
                }
                /* \{ */
                else if (/* possible Vt : '{' */ c == '{'/*'\u007B'(123)*/) {
                    BeginToken(context);
                    ExtendToken(context, st.@LeftBrace符);
                    wrap.currentState = lexicalState13;
                }
                /* deal with everything else. */
                else if (c == ' ' || c == '\r' || c == '\n' || c == '\t' || c == '\0') {
                    wrap.currentState = lexicalState0; // skip them.
                }
                else {
                    // unexpected char.
                    BeginToken(context);
                    ExtendToken(context);
                    AcceptToken(st.Error错, context);
                    wrap.currentState = lexicalState0;
                }
            };
    }
}


The implementation in the DFA folder is the first and most intuitive one. It has since been superseded by more efficient implementations, and the folder is now kept only for study and reference. I therefore changed the C# file extension from cs to cs_ to keep these files out of compilation.


miniDFA



The miniDFA folder contains all the lexical states of the minimized finite automaton produced by Hopcroft's algorithm. It differs from the DFA only in that the number of lexical states may be smaller.


This is the second implementation, and it too has been superseded by a more efficient one. The folder is now kept only for study and reference, so I changed the C# file extension from cs to cs_ to keep these files out of compilation.


tableDFA



The tableDFA folder stores the miniDFA in two-dimensional-array form (ElseIf[][]). It encodes the same content as the miniDFA; the difference is that the miniDFA represents each lexical state as a function (Action<LexicalContext, char, CurrentStateWrap>), whereas here each lexical state is an array (ElseIf[]). This reduces memory usage.
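The function-vs-array idea can be sketched in miniature. This is a simplified illustration with made-up names, not bitParser's actual ElseIf type (the real entries also carry actions and token types, not just a next-state id): each state is an array of character ranges, and the scanning loop searches the current state's ranges instead of calling a compiled function:

```csharp
using System;

// Simplified illustration of the table-driven idea: a lexical state is an
// array of (min, max, nextState) ranges instead of a function of else-ifs.
public static class TableDfaSketch
{
    public struct Range
    {
        public char Min, Max;
        public int NextState;
        public Range(char min, char max, int next) { Min = min; Max = max; NextState = next; }
    }

    // One-state DFA accepting [0-9]+ : digits stay in state 0, anything else rejects.
    static readonly Range[][] states = { new[] { new Range('0', '9', 0) } };

    public static bool MatchDigits(string input)
    {
        int state = 0;
        foreach (char c in input)
        {
            int next = -1;
            foreach (var r in states[state])
                if (r.Min <= c && c <= r.Max) { next = r.NextState; break; }
            if (next < 0) return false; // no range covers c
            state = next;
        }
        return input.Length > 0; // state 0 accepts after at least one digit
    }

    public static void Main()
    {
        Console.WriteLine(MatchDigits("12345")); // True
        Console.WriteLine(MatchDigits("12a45")); // False
    }
}
```

The data lives in plain arrays rather than in generated method bodies, which is exactly why this form is cheaper to hold in memory and easy to serialize.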


The miniDFA in two-dimensional-array form


using System;using bitzhuwei.Compiler;
namespace bitzhuwei.JsonFormat { partial class CompilerJson { private static readonly ElseIf[] omitChars = new ElseIf[] { new('\u0000'/*(0)*/, nextStateId: 0, Acts.None), new('\t'/*'\u0009'(9)*/, '\n'/*'\u000A'(10)*/, nextStateId: 0, Acts.None), new('\r'/*'\u000D'(13)*/, nextStateId: 0, Acts.None), new(' '/*'\u0020'(32)*/, nextStateId: 0, Acts.None),
}; private static readonly ElseIf[][] lexiStates = new ElseIf[47][]; static void InitializeLexiTable() { ElseIf segment_48_48_25_3_ints_number = new('0'/*'\u0030'(48)*/, 25, Acts.Begin | Acts.Extend, st.@number);//refered 2 times ElseIf segment_49_57_24_3_ints_number = new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 24, Acts.Begin | Acts.Extend, st.@number);//refered 2 times ElseIf segment_48_57_37_2_ints_number = new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 37, Acts.Extend, st.@number);//refered 3 times ElseIf segment_48_57_38_2_ints_number = new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 38, Acts.Extend, st.@number);//refered 2 times ElseIf segment_48_57_44_2_ints_number = new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 44, Acts.Extend, st.@number);//refered 3 times ElseIf segment_48_57_45_2_ints_number = new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 45, Acts.Extend, st.@number);//refered 2 times ElseIf segment_46_46_8_0 = new('.'/*'\u002E'(46)*/, 8, Acts.None);//refered 9 times ElseIf segment_48_48_33_2_ints_number = new('0'/*'\u0030'(48)*/, 33, Acts.Extend, st.@number);//refered 2 times ElseIf segment_49_57_32_2_ints_number = new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 32, Acts.Extend, st.@number);//refered 2 times ElseIf segment_69_69_7_0 = new('E'/*'\u0045'(69)*/, 7, Acts.None);//refered 11 times ElseIf segment_101_101_7_0 = new('e'/*'\u0065'(101)*/, 7, Acts.None);//refered 11 times ElseIf segment_0_65535_0_4_ints_number = new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number);//refered 13 times ElseIf segment_48_48_40_2_ints_number = new('0'/*'\u0030'(48)*/, 40, Acts.Extend, st.@number);//refered 3 times ElseIf segment_49_57_39_2_ints_number = new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 39, Acts.Extend, st.@number);//refered 3 times ElseIf segment_48_57_41_2_ints_number = new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 41, Acts.Extend, st.@number);//refered 2 times lexiStates[0] = new ElseIf[] { // possible Vt: 'string' 
/*0*/new('"'/*'\u0022'(34)*/, 2, Acts.Begin), // possible Vt: ',' /*1*/new(','/*'\u002C'(44)*/, 27, Acts.Begin | Acts.Extend, st.@Comma符), // possible Vt: 'number' /*2*/new('-'/*'\u002D'(45)*/, 1, Acts.Begin), // possible Vt: 'number' /*3*///new('0'/*'\u0030'(48)*/, 25, Acts.Begin | Acts.Extend, st.@number), /*3*/segment_48_48_25_3_ints_number, // possible Vt: 'number' /*4*///new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 24, Acts.Begin | Acts.Extend, st.@number), /*4*/segment_49_57_24_3_ints_number, // possible Vt: ':' /*5*/new(':'/*'\u003A'(58)*/, 26, Acts.Begin | Acts.Extend, st.@Colon符), // possible Vt: '[' /*6*/new('['/*'\u005B'(91)*/, 29, Acts.Begin | Acts.Extend, st.@LeftBracket符), // possible Vt: ']' /*7*/new(']'/*'\u005D'(93)*/, 28, Acts.Begin | Acts.Extend, st.@RightBracket符), // possible Vt: 'false' /*8*/new('f'/*'\u0066'(102)*/, 3, Acts.Begin), // possible Vt: 'null' /*9*/new('n'/*'\u006E'(110)*/, 5, Acts.Begin), // possible Vt: 'true' /*10*/new('t'/*'\u0074'(116)*/, 4, Acts.Begin), // possible Vt: '{' /*11*/new('{'/*'\u007B'(123)*/, 31, Acts.Begin | Acts.Extend, st.@LeftBrace符), // possible Vt: '}' /*12*/new('}'/*'\u007D'(125)*/, 30, Acts.Begin | Acts.Extend, st.@RightBrace符), }; lexiStates[1] = new ElseIf[] { // possible Vt: 'number' //new('0'/*'\u0030'(48)*/, 25, Acts.Begin | Acts.Extend, st.@number), segment_48_48_25_3_ints_number, // possible Vt: 'number' //new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 24, Acts.Begin | Acts.Extend, st.@number), segment_49_57_24_3_ints_number, }; lexiStates[2] = new ElseIf[] { // possible Vt: 'string' new(' '/*'\u0020'(32)*/, '!'/*'\u0021'(33)*/, 2, Acts.None), // possible Vt: 'string' new('"'/*'\u0022'(34)*/, 36, Acts.Extend, st.@string), // possible Vt: 'string' new('#'/*'\u0023'(35)*/, '['/*'\u005B'(91)*/, 2, Acts.None), // possible Vt: 'string' new('\\'/*'\u005C'(92)*/, 9, Acts.None), // possible Vt: 'string' new(']'/*'\u005D'(93)*/, '\uFFFF'/*�(65535)*/, 2, Acts.None), }; lexiStates[3] = new ElseIf[] { // 
possible Vt: 'false' new('a'/*'\u0061'(97)*/, 10, Acts.None), }; lexiStates[4] = new ElseIf[] { // possible Vt: 'true' new('r'/*'\u0072'(114)*/, 6, Acts.None), }; lexiStates[5] = new ElseIf[] { // possible Vt: 'null' new('u'/*'\u0075'(117)*/, 11, Acts.None), }; lexiStates[6] = new ElseIf[] { // possible Vt: 'true' new('u'/*'\u0075'(117)*/, 18, Acts.None), }; lexiStates[7] = new ElseIf[] { // possible Vt: 'number' new('+'/*'\u002B'(43)*/, 12, Acts.None), // possible Vt: 'number' new('-'/*'\u002D'(45)*/, 12, Acts.None), // possible Vt: 'number' //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 37, Acts.Extend, st.@number), segment_48_57_37_2_ints_number, }; lexiStates[8] = new ElseIf[] { // possible Vt: 'number' //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 38, Acts.Extend, st.@number), segment_48_57_38_2_ints_number, }; lexiStates[9] = new ElseIf[] { // possible Vt: 'string' new('"'/*'\u0022'(34)*/, 2, Acts.None), // possible Vt: 'string' new('/'/*'\u002F'(47)*/, 2, Acts.None), // possible Vt: 'string' new('\\'/*'\u005C'(92)*/, 2, Acts.None), // possible Vt: 'string' new('b'/*'\u0062'(98)*/, 2, Acts.None), // possible Vt: 'string' new('f'/*'\u0066'(102)*/, 2, Acts.None), // possible Vt: 'string' new('n'/*'\u006E'(110)*/, 2, Acts.None), // possible Vt: 'string' new('r'/*'\u0072'(114)*/, 2, Acts.None), // possible Vt: 'string' new('t'/*'\u0074'(116)*/, 2, Acts.None), // possible Vt: 'string' new('u'/*'\u0075'(117)*/, 13, Acts.None), }; lexiStates[10] = new ElseIf[] { // possible Vt: 'false' new('l'/*'\u006C'(108)*/, 17, Acts.None), }; lexiStates[11] = new ElseIf[] { // possible Vt: 'null' new('l'/*'\u006C'(108)*/, 19, Acts.None), }; lexiStates[12] = new ElseIf[] { // possible Vt: 'number' //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 37, Acts.Extend, st.@number), segment_48_57_37_2_ints_number, }; lexiStates[13] = new ElseIf[] { // possible Vt: 'string' new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 14, Acts.None), // possible Vt: 'string' new('A'/*'\u0041'(65)*/, 
'F'/*'\u0046'(70)*/, 14, Acts.None), // possible Vt: 'string' new('a'/*'\u0061'(97)*/, 'f'/*'\u0066'(102)*/, 14, Acts.None), }; lexiStates[14] = new ElseIf[] { // possible Vt: 'string' new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 15, Acts.None), // possible Vt: 'string' new('A'/*'\u0041'(65)*/, 'F'/*'\u0046'(70)*/, 15, Acts.None), // possible Vt: 'string' new('a'/*'\u0061'(97)*/, 'f'/*'\u0066'(102)*/, 15, Acts.None), }; lexiStates[15] = new ElseIf[] { // possible Vt: 'string' new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 16, Acts.None), // possible Vt: 'string' new('A'/*'\u0041'(65)*/, 'F'/*'\u0046'(70)*/, 16, Acts.None), // possible Vt: 'string' new('a'/*'\u0061'(97)*/, 'f'/*'\u0066'(102)*/, 16, Acts.None), }; lexiStates[16] = new ElseIf[] { // possible Vt: 'string' new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 2, Acts.None), // possible Vt: 'string' new('A'/*'\u0041'(65)*/, 'F'/*'\u0046'(70)*/, 2, Acts.None), // possible Vt: 'string' new('a'/*'\u0061'(97)*/, 'f'/*'\u0066'(102)*/, 2, Acts.None), }; lexiStates[17] = new ElseIf[] { // possible Vt: 'false' new('s'/*'\u0073'(115)*/, 22, Acts.None), }; lexiStates[18] = new ElseIf[] { // possible Vt: 'true' new('e'/*'\u0065'(101)*/, 42, Acts.Extend, st.@true), }; lexiStates[19] = new ElseIf[] { // possible Vt: 'null' new('l'/*'\u006C'(108)*/, 43, Acts.Extend, st.@null), }; lexiStates[20] = new ElseIf[] { // possible Vt: 'number' new('+'/*'\u002B'(43)*/, 23, Acts.None), // possible Vt: 'number' new('-'/*'\u002D'(45)*/, 23, Acts.None), // possible Vt: 'number' //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 44, Acts.Extend, st.@number), segment_48_57_44_2_ints_number, }; lexiStates[21] = new ElseIf[] { // possible Vt: 'number' //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 45, Acts.Extend, st.@number), segment_48_57_45_2_ints_number, }; lexiStates[22] = new ElseIf[] { // possible Vt: 'false' new('e'/*'\u0065'(101)*/, 46, Acts.Extend, st.@false), }; lexiStates[23] = new ElseIf[] { // possible Vt: 'number' 
//new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 44, Acts.Extend, st.@number), segment_48_57_44_2_ints_number, }; lexiStates[24] = new ElseIf[] { // possible Vt: 'number' //new('.'/*'\u002E'(46)*/, 8, Acts.None), segment_46_46_8_0, // possible Vt: 'number' //new('0'/*'\u0030'(48)*/, 33, Acts.Extend, st.@number), segment_48_48_33_2_ints_number, // possible Vt: 'number' //new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 32, Acts.Extend, st.@number), segment_49_57_32_2_ints_number, // possible Vt: 'number' //new('E'/*'\u0045'(69)*/, 7, Acts.None), segment_69_69_7_0, // possible Vt: 'number' //new('e'/*'\u0065'(101)*/, 7, Acts.None), segment_101_101_7_0, // possible Vt: 'number' //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number), segment_0_65535_0_4_ints_number, }; lexiStates[25] = new ElseIf[] { // possible Vt: 'number' //new('.'/*'\u002E'(46)*/, 8, Acts.None), segment_46_46_8_0, // possible Vt: 'number' new('0'/*'\u0030'(48)*/, 35, Acts.Extend, st.@number), // possible Vt: 'number' new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 34, Acts.Extend, st.@number), // possible Vt: 'number' //new('E'/*'\u0045'(69)*/, 7, Acts.None), segment_69_69_7_0, // possible Vt: 'number' //new('e'/*'\u0065'(101)*/, 7, Acts.None), segment_101_101_7_0, // possible Vt: 'number' //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number), segment_0_65535_0_4_ints_number, }; lexiStates[26] = new ElseIf[] { // possible Vt: ':' new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@Colon符), }; lexiStates[27] = new ElseIf[] { // possible Vt: ',' new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@Comma符), }; lexiStates[28] = new ElseIf[] { // possible Vt: ']' new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@RightBracket符), }; lexiStates[29] = new ElseIf[] { // possible Vt: '[' new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@LeftBracket符), }; lexiStates[30] = new ElseIf[] { // possible Vt: '}' new('\u0000'/*(0)*/, 
'\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@RightBrace符), }; lexiStates[31] = new ElseIf[] { // possible Vt: '{' new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@LeftBrace符), }; lexiStates[32] = new ElseIf[] { // possible Vt: 'number' //new('.'/*'\u002E'(46)*/, 8, Acts.None), segment_46_46_8_0, // possible Vt: 'number' //new('0'/*'\u0030'(48)*/, 40, Acts.Extend, st.@number), segment_48_48_40_2_ints_number, // possible Vt: 'number' //new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 39, Acts.Extend, st.@number), segment_49_57_39_2_ints_number, // possible Vt: 'number' //new('E'/*'\u0045'(69)*/, 7, Acts.None), segment_69_69_7_0, // possible Vt: 'number' //new('e'/*'\u0065'(101)*/, 7, Acts.None), segment_101_101_7_0, // possible Vt: 'number' //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number), segment_0_65535_0_4_ints_number, }; lexiStates[33] = new ElseIf[] { // possible Vt: 'number' //new('.'/*'\u002E'(46)*/, 8, Acts.None), segment_46_46_8_0, // possible Vt: 'number' //new('0'/*'\u0030'(48)*/, 33, Acts.Extend, st.@number), segment_48_48_33_2_ints_number, // possible Vt: 'number' //new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 32, Acts.Extend, st.@number), segment_49_57_32_2_ints_number, // possible Vt: 'number' //new('E'/*'\u0045'(69)*/, 7, Acts.None), segment_69_69_7_0, // possible Vt: 'number' //new('e'/*'\u0065'(101)*/, 7, Acts.None), segment_101_101_7_0, // possible Vt: 'number' //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number), segment_0_65535_0_4_ints_number, }; lexiStates[34] = new ElseIf[] { // possible Vt: 'number' //new('.'/*'\u002E'(46)*/, 8, Acts.None), segment_46_46_8_0, // possible Vt: 'number' //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 41, Acts.Extend, st.@number), segment_48_57_41_2_ints_number, // possible Vt: 'number' //new('E'/*'\u0045'(69)*/, 7, Acts.None), segment_69_69_7_0, // possible Vt: 'number' //new('e'/*'\u0065'(101)*/, 7, Acts.None), segment_101_101_7_0, // possible Vt: 'number' 
//new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number), segment_0_65535_0_4_ints_number, }; lexiStates[35] = new ElseIf[] { // possible Vt: 'number' //new('.'/*'\u002E'(46)*/, 8, Acts.None), segment_46_46_8_0, // possible Vt: 'number' //new('E'/*'\u0045'(69)*/, 7, Acts.None), segment_69_69_7_0, // possible Vt: 'number' //new('e'/*'\u0065'(101)*/, 7, Acts.None), segment_101_101_7_0, // possible Vt: 'number' //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number), segment_0_65535_0_4_ints_number, }; lexiStates[36] = new ElseIf[] { // possible Vt: 'string' new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@string), }; lexiStates[37] = new ElseIf[] { // possible Vt: 'number' //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 37, Acts.Extend, st.@number), segment_48_57_37_2_ints_number, // possible Vt: 'number' new('E'/*'\u0045'(69)*/, 20, Acts.None), // possible Vt: 'number' new('e'/*'\u0065'(101)*/, 20, Acts.None), // possible Vt: 'number' //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number), segment_0_65535_0_4_ints_number, }; lexiStates[38] = new ElseIf[] { // possible Vt: 'number' new('.'/*'\u002E'(46)*/, 21, Acts.None), // possible Vt: 'number' //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 38, Acts.Extend, st.@number), segment_48_57_38_2_ints_number, // possible Vt: 'number' //new('E'/*'\u0045'(69)*/, 7, Acts.None), segment_69_69_7_0, // possible Vt: 'number' //new('e'/*'\u0065'(101)*/, 7, Acts.None), segment_101_101_7_0, // possible Vt: 'number' //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number), segment_0_65535_0_4_ints_number, }; lexiStates[39] = new ElseIf[] { // possible Vt: 'number' //new('.'/*'\u002E'(46)*/, 8, Acts.None), segment_46_46_8_0, // possible Vt: 'number' //new('0'/*'\u0030'(48)*/, 40, Acts.Extend, st.@number), segment_48_48_40_2_ints_number, // possible Vt: 'number' //new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 39, Acts.Extend, st.@number), 
segment_49_57_39_2_ints_number, // possible Vt: 'number' //new('E'/*'\u0045'(69)*/, 7, Acts.None), segment_69_69_7_0, // possible Vt: 'number' //new('e'/*'\u0065'(101)*/, 7, Acts.None), segment_101_101_7_0, // possible Vt: 'number' //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number), segment_0_65535_0_4_ints_number, }; lexiStates[40] = new ElseIf[] { // possible Vt: 'number' //new('.'/*'\u002E'(46)*/, 8, Acts.None), segment_46_46_8_0, // possible Vt: 'number' //new('0'/*'\u0030'(48)*/, 40, Acts.Extend, st.@number), segment_48_48_40_2_ints_number, // possible Vt: 'number' //new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 39, Acts.Extend, st.@number), segment_49_57_39_2_ints_number, // possible Vt: 'number' //new('E'/*'\u0045'(69)*/, 7, Acts.None), segment_69_69_7_0, // possible Vt: 'number' //new('e'/*'\u0065'(101)*/, 7, Acts.None), segment_101_101_7_0, // possible Vt: 'number' //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number), segment_0_65535_0_4_ints_number, }; lexiStates[41] = new ElseIf[] { // possible Vt: 'number' //new('.'/*'\u002E'(46)*/, 8, Acts.None), segment_46_46_8_0, // possible Vt: 'number' //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 41, Acts.Extend, st.@number), segment_48_57_41_2_ints_number, // possible Vt: 'number' //new('E'/*'\u0045'(69)*/, 7, Acts.None), segment_69_69_7_0, // possible Vt: 'number' //new('e'/*'\u0065'(101)*/, 7, Acts.None), segment_101_101_7_0, // possible Vt: 'number' //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number), segment_0_65535_0_4_ints_number, }; lexiStates[42] = new ElseIf[] { // possible Vt: 'true' new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@true), }; lexiStates[43] = new ElseIf[] { // possible Vt: 'null' new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@null), }; lexiStates[44] = new ElseIf[] { // possible Vt: 'number' //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 44, Acts.Extend, st.@number), 
segment_48_57_44_2_ints_number, // possible Vt: 'number' //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number), segment_0_65535_0_4_ints_number, }; lexiStates[45] = new ElseIf[] { // possible Vt: 'number' //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 45, Acts.Extend, st.@number), segment_48_57_45_2_ints_number, // possible Vt: 'number' //new('E'/*'\u0045'(69)*/, 7, Acts.None), segment_69_69_7_0, // possible Vt: 'number' //new('e'/*'\u0065'(101)*/, 7, Acts.None), segment_101_101_7_0, // possible Vt: 'number' //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number), segment_0_65535_0_4_ints_number, }; lexiStates[46] = new ElseIf[] { // possible Vt: 'false' new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@false), }; } }}


This is the third implementation, and it too has been superseded by a more efficient one. The folder is now kept only for study and reference, so I changed the C# file extension from cs to cs_ to keep these files out of compilation.


Json.LexiTable.gen.bin



This is the two-dimensional-array (ElseIf[][]) miniDFA serialized into a binary file. When the Json parser is loaded, reading this file reconstructs the ElseIf[][] miniDFA, so the whole table no longer needs to be hard-coded in the source, which further reduces memory usage.
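The serialization idea can be sketched as follows. This is a simplified illustration with made-up names; the real ElseIf entries store character ranges, actions, and token types rather than plain ints, but the round-trip shape is the same:

```csharp
using System;
using System.IO;

// Illustrative only: round-trips a jagged transition table through a
// binary file, the same general idea as Json.LexiTable.gen.bin.
public static class LexiTableIo
{
    public static void Save(string path, int[][] table)
    {
        using var w = new BinaryWriter(File.Create(path));
        w.Write(table.Length);                 // row count
        foreach (var row in table)
        {
            w.Write(row.Length);               // cells in this row
            foreach (var cell in row) w.Write(cell);
        }
    }

    public static int[][] Load(string path)
    {
        using var r = new BinaryReader(File.OpenRead(path));
        var table = new int[r.ReadInt32()][];
        for (int i = 0; i < table.Length; i++)
        {
            table[i] = new int[r.ReadInt32()];
            for (int j = 0; j < table[i].Length; j++) table[i][j] = r.ReadInt32();
        }
        return table;
    }

    public static void Main()
    {
        var table = new[] { new[] { 1, 2 }, new[] { 3 } };
        Save("demo.bin", table);
        Console.WriteLine(Load("demo.bin")[1][0]); // 3
    }
}
```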


For easier debugging and reference, I also generate a corresponding text version:


Json.LexiTable.gen.txt


ElseIf4 omit chars:0('\u0000'/*(0)*/->'\u0000'/*(0)*/)=>None,00('\t'/*'\u0009'(9)*/->'\n'/*'\u000A'(10)*/)=>None,00('\r'/*'\u000D'(13)*/->'\r'/*'\u000D'(13)*/)=>None,00(' '/*'\u0020'(32)*/->' '/*'\u0020'(32)*/)=>None,0
0 re-used int[] Vts:0 re-used IfVt ifVt:0 re-used IfVt[] ifVts:15 re-used ElseIf2 segment:25('0'/*'\u0030'(48)*/->'0'/*'\u0030'(48)*/)=>Begin, Extend,1124('1'/*'\u0031'(49)*/->'9'/*'\u0039'(57)*/)=>Begin, Extend,1137('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,1138('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,1144('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,1145('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,118('.'/*'\u002E'(46)*/->'.'/*'\u002E'(46)*/)=>None,033('0'/*'\u0030'(48)*/->'0'/*'\u0030'(48)*/)=>Extend,1132('1'/*'\u0031'(49)*/->'9'/*'\u0039'(57)*/)=>Extend,117('E'/*'\u0045'(69)*/->'E'/*'\u0045'(69)*/)=>None,07('e'/*'\u0065'(101)*/->'e'/*'\u0065'(101)*/)=>None,00('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,1140('0'/*'\u0030'(48)*/->'0'/*'\u0030'(48)*/)=>Extend,1139('1'/*'\u0031'(49)*/->'9'/*'\u0039'(57)*/)=>Extend,1141('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,1147 ElseIf2[] row:LexiTable.Rows[0] has 13 segments:2('"'/*'\u0022'(34)*/->'"'/*'\u0022'(34)*/)=>Begin,027(','/*'\u002C'(44)*/->','/*'\u002C'(44)*/)=>Begin, Extend,51('-'/*'\u002D'(45)*/->'-'/*'\u002D'(45)*/)=>Begin,0-1-226(':'/*'\u003A'(58)*/->':'/*'\u003A'(58)*/)=>Begin, Extend,729('['/*'\u005B'(91)*/->'['/*'\u005B'(91)*/)=>Begin, Extend,328(']'/*'\u005D'(93)*/->']'/*'\u005D'(93)*/)=>Begin, Extend,43('f'/*'\u0066'(102)*/->'f'/*'\u0066'(102)*/)=>Begin,05('n'/*'\u006E'(110)*/->'n'/*'\u006E'(110)*/)=>Begin,04('t'/*'\u0074'(116)*/->'t'/*'\u0074'(116)*/)=>Begin,031('{'/*'\u007B'(123)*/->'{'/*'\u007B'(123)*/)=>Begin, Extend,130('}'/*'\u007D'(125)*/->'}'/*'\u007D'(125)*/)=>Begin, Extend,2
LexiTable.Rows[1] has 2 segments:-1-2
LexiTable.Rows[2] has 5 segments:2(' '/*'\u0020'(32)*/->'!'/*'\u0021'(33)*/)=>None,036('"'/*'\u0022'(34)*/->'"'/*'\u0022'(34)*/)=>Extend,62('#'/*'\u0023'(35)*/->'['/*'\u005B'(91)*/)=>None,09('\\'/*'\u005C'(92)*/->'\\'/*'\u005C'(92)*/)=>None,02(']'/*'\u005D'(93)*/->'\uFFFF'/*�(65535)*/)=>None,0
LexiTable.Rows[3] has 1 segments:10('a'/*'\u0061'(97)*/->'a'/*'\u0061'(97)*/)=>None,0
LexiTable.Rows[4] has 1 segments:6('r'/*'\u0072'(114)*/->'r'/*'\u0072'(114)*/)=>None,0
LexiTable.Rows[5] has 1 segments:11('u'/*'\u0075'(117)*/->'u'/*'\u0075'(117)*/)=>None,0
LexiTable.Rows[6] has 1 segments:18('u'/*'\u0075'(117)*/->'u'/*'\u0075'(117)*/)=>None,0
LexiTable.Rows[7] has 3 segments:12('+'/*'\u002B'(43)*/->'+'/*'\u002B'(43)*/)=>None,012('-'/*'\u002D'(45)*/->'-'/*'\u002D'(45)*/)=>None,0-3
LexiTable.Rows[8] has 1 segments:-4
LexiTable.Rows[9] has 9 segments:2('"'/*'\u0022'(34)*/->'"'/*'\u0022'(34)*/)=>None,02('/'/*'\u002F'(47)*/->'/'/*'\u002F'(47)*/)=>None,02('\\'/*'\u005C'(92)*/->'\\'/*'\u005C'(92)*/)=>None,02('b'/*'\u0062'(98)*/->'b'/*'\u0062'(98)*/)=>None,02('f'/*'\u0066'(102)*/->'f'/*'\u0066'(102)*/)=>None,02('n'/*'\u006E'(110)*/->'n'/*'\u006E'(110)*/)=>None,02('r'/*'\u0072'(114)*/->'r'/*'\u0072'(114)*/)=>None,02('t'/*'\u0074'(116)*/->'t'/*'\u0074'(116)*/)=>None,013('u'/*'\u0075'(117)*/->'u'/*'\u0075'(117)*/)=>None,0
LexiTable.Rows[10] has 1 segments:17('l'/*'\u006C'(108)*/->'l'/*'\u006C'(108)*/)=>None,0
LexiTable.Rows[11] has 1 segments:19('l'/*'\u006C'(108)*/->'l'/*'\u006C'(108)*/)=>None,0
LexiTable.Rows[12] has 1 segments:-3
LexiTable.Rows[13] has 3 segments:14('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>None,014('A'/*'\u0041'(65)*/->'F'/*'\u0046'(70)*/)=>None,014('a'/*'\u0061'(97)*/->'f'/*'\u0066'(102)*/)=>None,0
LexiTable.Rows[14] has 3 segments:15('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>None,015('A'/*'\u0041'(65)*/->'F'/*'\u0046'(70)*/)=>None,015('a'/*'\u0061'(97)*/->'f'/*'\u0066'(102)*/)=>None,0
LexiTable.Rows[15] has 3 segments:16('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>None,016('A'/*'\u0041'(65)*/->'F'/*'\u0046'(70)*/)=>None,016('a'/*'\u0061'(97)*/->'f'/*'\u0066'(102)*/)=>None,0
LexiTable.Rows[16] has 3 segments:2('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>None,02('A'/*'\u0041'(65)*/->'F'/*'\u0046'(70)*/)=>None,02('a'/*'\u0061'(97)*/->'f'/*'\u0066'(102)*/)=>None,0
LexiTable.Rows[17] has 1 segments:22('s'/*'\u0073'(115)*/->'s'/*'\u0073'(115)*/)=>None,0
LexiTable.Rows[18] has 1 segments:42('e'/*'\u0065'(101)*/->'e'/*'\u0065'(101)*/)=>Extend,9
LexiTable.Rows[19] has 1 segments:43('l'/*'\u006C'(108)*/->'l'/*'\u006C'(108)*/)=>Extend,8
LexiTable.Rows[20] has 3 segments:23('+'/*'\u002B'(43)*/->'+'/*'\u002B'(43)*/)=>None,023('-'/*'\u002D'(45)*/->'-'/*'\u002D'(45)*/)=>None,0-5
LexiTable.Rows[21] has 1 segments:-6
LexiTable.Rows[22] has 1 segments:46('e'/*'\u0065'(101)*/->'e'/*'\u0065'(101)*/)=>Extend,10
LexiTable.Rows[23] has 1 segments:-5
LexiTable.Rows[24] has 6 segments:-7-8-9-10-11-12
LexiTable.Rows[25] has 6 segments:-735('0'/*'\u0030'(48)*/->'0'/*'\u0030'(48)*/)=>Extend,1134('1'/*'\u0031'(49)*/->'9'/*'\u0039'(57)*/)=>Extend,11-10-11-12
LexiTable.Rows[26] has 1 segments:0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,7
LexiTable.Rows[27] has 1 segments:0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,5
LexiTable.Rows[28] has 1 segments:0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,4
LexiTable.Rows[29] has 1 segments:0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,3
LexiTable.Rows[30] has 1 segments:0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,2
LexiTable.Rows[31] has 1 segments:0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,1
LexiTable.Rows[32] has 6 segments:-7-13-14-10-11-12
LexiTable.Rows[33] has 6 segments:-7-8-9-10-11-12
LexiTable.Rows[34] has 5 segments:-7-15-10-11-12
LexiTable.Rows[35] has 4 segments:-7-10-11-12
LexiTable.Rows[36] has 1 segments:0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,6
LexiTable.Rows[37] has 4 segments:-320('E'/*'\u0045'(69)*/->'E'/*'\u0045'(69)*/)=>None,020('e'/*'\u0065'(101)*/->'e'/*'\u0065'(101)*/)=>None,0-12
LexiTable.Rows[38] has 5 segments:21('.'/*'\u002E'(46)*/->'.'/*'\u002E'(46)*/)=>None,0-4-10-11-12
LexiTable.Rows[39] has 6 segments:-7-13-14-10-11-12
LexiTable.Rows[40] has 6 segments:-7-13-14-10-11-12
LexiTable.Rows[41] has 5 segments:-7-15-10-11-12
LexiTable.Rows[42] has 1 segments:0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,9
LexiTable.Rows[43] has 1 segments:0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,8
LexiTable.Rows[44] has 2 segments:-5-12
LexiTable.Rows[45] has 4 segments:-6-10-11-12
LexiTable.Rows[46] has 1 segments:0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,10


This is the fourth implementation and the one currently in use. To simplify the load path, I moved it from the Json.gen\LexicalAnalyzer folder up into the Json.gen folder.


Json.LexicalScripts.gen.cs


This file holds the functions that every lexical state may call. They come in three groups: Begin, Extend, and Accept. Together they record a token's start position (Begin) and end position (Extend), set its type, line, and column, and finally append it to the List<Token> tokens (Accept).


Json.LexicalScripts.gen.cs


using System;
using System.Collections.Generic;
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {
        // this is where a new <see cref="Token"/> starts.
        private static void BeginToken(LexicalContext context) {
            if (context.analyzingToken.type != AnalyzingToken.NotYet) {
                context.analyzingToken.Reset(index: context.result.Count, start: context.cursor);
            }
        }

        // extend value of current token (<see cref="LexicalContext.analyzingToken"/>)
        private static void ExtendToken(LexicalContext context, int Vt) {
            context.analyzingToken.ends[Vt] = context.cursor;
        }

        private static void ExtendToken2(LexicalContext context, params int[] Vts) {
            for (int i = 0; i < Vts.Length; i++) {
                var Vt = Vts[i];
                context.analyzingToken.ends[Vt] = context.cursor;
            }
        }

        private static void ExtendToken3(LexicalContext context, params IfVt[] ifVts) {
            for (int i = 0; i < ifVts.Length; i++) {
                var Vt = ifVts[i].Vt;
                context.analyzingToken.ends[Vt] = context.cursor;
            }
        }

        // accept current Token:
        // set Token.type and neutralize the last LexicalContext.MoveForward()
        private static void AcceptToken(LexicalContext context, int Vt) {
            var startIndex = context.analyzingToken.start.index;
            var end = context.analyzingToken.ends[Vt];
            context.analyzingToken.value = context.sourceCode.Substring(
                startIndex, end.index - startIndex + 1);
            context.analyzingToken.type = Vt;

            // cancel forward steps for post-regex
            var backStep = context.cursor.index - end.index;
            if (backStep > 0) { context.MoveBack(backStep); }
            // next operation: LexicalContext.MoveForward();

            var token = context.analyzingToken.Dump(
#if DEBUG
                context.stArray,
#endif
                end);
            context.result.Add(token);
            // no comment to skip
            context.lastSyntaxValidToken = token;
            if (token.type == st.Error错) {
                context.result.token2ErrorInfo.Add(token,
                    new TokenErrorInfo(token, "token type unrecognized!"));
            }
        }

        private static void AcceptToken2(LexicalContext context, params int[] Vts) {
            AcceptToken(context, Vts[0]);
        }

        private static void AcceptToken3(LexicalContext context, params IfVt[] ifVts) {
            var typeSet = false;
            int lastType = st.@终;
            if (context.lastSyntaxValidToken != null) {
                lastType = context.lastSyntaxValidToken.type;
            }
            for (var i = 0; i < ifVts.Length; i++) {
                var ifVt = ifVts[i];
                if (ifVt.signalCondition == context.signalCondition
                    // if preVt is string.Empty, let's use the first type.
                    // otherwise, preVt must be the lastType.
                    && (ifVt.preVt == st.@终 // default preVt
                        || ifVt.preVt == lastType)) { // <'Vt'>
                    context.analyzingToken.type = ifVt.Vt;
                    if (ifVt.nextSignal != null) {
                        context.signalCondition = ifVt.nextSignal;
                    }
                    typeSet = true;
                    break;
                }
            }
            if (!typeSet) {
                for (var i = 0; i < ifVts.Length; i++) {
                    var ifVt = ifVts[i];
                    // ignore signal condition and try to assign a type.
                    // if preVt is string.Empty, let's use the first type.
                    // otherwise, preVt must be the lastType.
                    if (ifVt.preVt == st.@终 // default preVt
                        || ifVt.preVt == lastType) { // <'Vt'>
                        context.analyzingToken.type = ifVt.Vt;
                        context.signalCondition = LexicalContext.defaultSignal;
                        typeSet = true;
                        break;
                    }
                }
            }

            var startIndex = context.analyzingToken.start.index;
            var end = context.analyzingToken.start;
            if (!typeSet) {
                // we failed to assign a type according to the lexi statements.
                // this indicates a token error in the source code or inappropriate lexi statements.
                //throw new Exception("Algorithm error: token type not set!");
                context.analyzingToken.type = st.Error错;
                context.signalCondition = LexicalContext.defaultSignal;
                // choose the longest value
                for (int i = 0; i < context.analyzingToken.ends.Length; i++) {
                    var item = context.analyzingToken.ends[i];
                    if (end.index < item.index) { end = item; }
                }
            }
            else {
                end = context.analyzingToken.ends[context.analyzingToken.type];
            }
            context.analyzingToken.value = context.sourceCode.Substring(
                startIndex, end.index - startIndex + 1);

            // cancel forward steps for post-regex
            var backStep = context.cursor.index - end.index;
            if (backStep > 0) { context.MoveBack(backStep); }
            // next operation: context.MoveForward();

            var token = context.analyzingToken.Dump(
#if DEBUG
                context.stArray,
#endif
                end);
            context.result.Add(token);
            // no comment to skip
            context.lastSyntaxValidToken = token;
            if (token.type == st.Error错) {
                context.result.token2ErrorInfo.Add(token,
                    new TokenErrorInfo(token, "token type unrecognized!"));
            }
        }
    }
}
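The "cancel forward steps" logic in `AcceptToken` above implements maximal munch: the cursor may have moved several characters past the last accepting DFA state (for example while checking a post-regex), and the lexer then steps back to the recorded accept position. The following is a minimal sketch of that record-then-fall-back idea, independent of bitParser's actual types:

```csharp
using System;

// Longest-match scan of a toy 'number' token: advance while the DFA
// accepts, remember the index of the last accepting character, and
// fall back to it when scanning stops (the MoveBack in AcceptToken).
string ScanNumber(string source, int start)
{
    int lastAccept = -1; // index of the last char that ended a valid token
    int cursor = start;
    while (cursor < source.Length && char.IsDigit(source[cursor]))
    {
        lastAccept = cursor; // this position is an accepting state
        cursor++;            // MoveForward()
    }
    if (lastAccept < 0) return ""; // no token recognized here
    // "MoveBack" to the accept position: emit only the accepted span.
    return source.Substring(start, lastAccept - start + 1);
}

Console.WriteLine(ScanNumber("123,\"x\"", 0)); // 123
```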


Json.LexicalReservedWords.gen.cs


This file records all the reserved words of the Json grammar (what other languages would call keywords), namely `{` `}` `[` `]` `,` `:` `null` `true` `false`. It is clearly auxiliary material, so don't worry about it.


Json.LexicalReservedWords.gen.cs


using System;
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {
        public static class reservedWord {
            /// <summary>
            /// {
            /// </summary>
            public const string @LeftBrace符 = "{";
            /// <summary>
            /// }
            /// </summary>
            public const string @RightBrace符 = "}";
            /// <summary>
            /// [
            /// </summary>
            public const string @LeftBracket符 = "[";
            /// <summary>
            /// ]
            /// </summary>
            public const string @RightBracket符 = "]";
            /// <summary>
            /// ,
            /// </summary>
            public const string @Comma符 = ",";
            /// <summary>
            /// :
            /// </summary>
            public const string @Colon符 = ":";
            /// <summary>
            /// null
            /// </summary>
            public const string @null = "null";
            /// <summary>
            /// true
            /// </summary>
            public const string @true = "true";
            /// <summary>
            /// false
            /// </summary>
            public const string @false = "false";
        }

        /// <summary>
        /// if <paramref name="token"/> is a reserved word, assign the corresponding type and return true.
        /// <para>otherwise, return false.</para>
        /// </summary>
        /// <param name="token"></param>
        /// <returns></returns>
        private static bool CheckReservedWord(AnalyzingToken token) {
            bool isReservedWord = true;
            switch (token.value) {
                case reservedWord.@LeftBrace符: token.type = st.@LeftBrace符; break;
                case reservedWord.@RightBrace符: token.type = st.@RightBrace符; break;
                case reservedWord.@LeftBracket符: token.type = st.@LeftBracket符; break;
                case reservedWord.@RightBracket符: token.type = st.@RightBracket符; break;
                case reservedWord.@Comma符: token.type = st.@Comma符; break;
                case reservedWord.@Colon符: token.type = st.@Colon符; break;
                case reservedWord.@null: token.type = st.@null; break;
                case reservedWord.@true: token.type = st.@true; break;
                case reservedWord.@false: token.type = st.@false; break;
                default: isReservedWord = false; break;
            }
            return isReservedWord;
        }
    }
}
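The pattern above (scan a token first, then promote its scanned value to a keyword type via one switch) can be exercised in isolation. Here is a minimal sketch with hypothetical numeric type ids standing in for the generated `st.*` constants:

```csharp
using System;

// Toy version of CheckReservedWord: map a scanned token value to a
// keyword type id; return false when the value is not a reserved word.
// TypeNull/TypeTrue/TypeFalse are hypothetical stand-ins for st.* ids.
const int TypeNull = 8, TypeTrue = 9, TypeFalse = 10;

bool CheckReservedWord(string value, out int type)
{
    switch (value)
    {
        case "null": type = TypeNull; return true;
        case "true": type = TypeTrue; return true;
        case "false": type = TypeFalse; return true;
        default: type = -1; return false; // not a reserved word
    }
}

bool ok = CheckReservedWord("null", out int t);
Console.WriteLine($"{ok} {t}");                     // True 8
Console.WriteLine(CheckReservedWord("nul", out _)); // False
```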


README.gen.md


This is the lexer's documentation. It uses mermaid to draw the state machine of every token as well as the combined state machine of the whole grammar, as shown in the figure below.



I know you can't make it out at this size. Neither can I. Find a big screen and read the README.gen.md file directly.


The generated syntax parser



Dictionary<int, LRParseAction>


Json.Dict.LALR(1).gen.cs_ is the LALR(1) syntax state machine, in which every syntax state is a Dictionary<int, LRParseAction> object.


Json.Dict.LALR(1).gen.cs_


using System;using bitzhuwei.Compiler;
namespace bitzhuwei.JsonFormat { partial class CompilerJson {
private static Dictionary<int, LRParseAction>[] InitializeSyntaxStates() { const int syntaxStateCount = 29; var states = new Dictionary<int, LRParseAction>[syntaxStateCount]; // 102 actions // conflicts(0)=not sovled(0)+solved(0)(0 warnings) #region create objects of syntax states states[0] = new(capacity: 5); states[1] = new(capacity: 1); states[2] = new(capacity: 1); states[3] = new(capacity: 1); states[4] = new(capacity: 4); states[5] = new(capacity: 13); states[6] = new(capacity: 4); states[7] = new(capacity: 2); states[8] = new(capacity: 2); states[9] = new(capacity: 1); states[10] = new(capacity: 4); states[11] = new(capacity: 2); states[12] = new(capacity: 2); states[13] = new(capacity: 2); states[14] = new(capacity: 3); states[15] = new(capacity: 3); states[16] = new(capacity: 3); states[17] = new(capacity: 3); states[18] = new(capacity: 3); states[19] = new(capacity: 3); states[20] = new(capacity: 3); states[21] = new(capacity: 4); states[22] = new(capacity: 2); states[23] = new(capacity: 10); states[24] = new(capacity: 4); states[25] = new(capacity: 11); states[26] = new(capacity: 2); states[27] = new(capacity: 2); states[28] = new(capacity: 2); #endregion create objects of syntax states
#region re-used actions LRParseAction aShift4 = new(LRParseAction.Kind.Shift, states[4]);// refered 4 times LRParseAction aShift5 = new(LRParseAction.Kind.Shift, states[5]);// refered 4 times LRParseAction aShift9 = new(LRParseAction.Kind.Shift, states[9]);// refered 2 times LRParseAction aGoto13 = new(LRParseAction.Kind.Goto, states[13]);// refered 2 times LRParseAction aShift14 = new(LRParseAction.Kind.Shift, states[14]);// refered 3 times LRParseAction aShift15 = new(LRParseAction.Kind.Shift, states[15]);// refered 3 times LRParseAction aShift16 = new(LRParseAction.Kind.Shift, states[16]);// refered 3 times LRParseAction aShift17 = new(LRParseAction.Kind.Shift, states[17]);// refered 3 times LRParseAction aShift18 = new(LRParseAction.Kind.Shift, states[18]);// refered 3 times LRParseAction aGoto19 = new(LRParseAction.Kind.Goto, states[19]);// refered 3 times LRParseAction aGoto20 = new(LRParseAction.Kind.Goto, states[20]);// refered 3 times LRParseAction aReduce2 = new(regulations[2]);// refered 4 times LRParseAction aReduce7 = new(regulations[7]);// refered 2 times LRParseAction aReduce4 = new(regulations[4]);// refered 4 times LRParseAction aReduce9 = new(regulations[9]);// refered 2 times LRParseAction aReduce11 = new(regulations[11]);// refered 2 times LRParseAction aReduce12 = new(regulations[12]);// refered 3 times LRParseAction aReduce13 = new(regulations[13]);// refered 3 times LRParseAction aReduce14 = new(regulations[14]);// refered 3 times LRParseAction aReduce15 = new(regulations[15]);// refered 3 times LRParseAction aReduce16 = new(regulations[16]);// refered 3 times LRParseAction aReduce17 = new(regulations[17]);// refered 3 times LRParseAction aReduce18 = new(regulations[18]);// refered 3 times LRParseAction aReduce3 = new(regulations[3]);// refered 4 times LRParseAction aReduce5 = new(regulations[5]);// refered 4 times LRParseAction aReduce6 = new(regulations[6]);// refered 2 times LRParseAction aReduce10 = new(regulations[10]);// refered 2 times 
LRParseAction aReduce8 = new(regulations[8]);// refered 2 times #endregion re-used actions
// 102 actions // conflicts(0)=not sovled(0)+solved(0)(0 warnings) #region init actions of syntax states // syntaxStates[0]: // [-1] Json' : ⏳ Json ;☕ '¥' // [0] Json : ⏳ Object ;☕ '¥' // [1] Json : ⏳ Array ;☕ '¥' // [2] Object : ⏳ '{' '}' ;☕ '¥' // [3] Object : ⏳ '{' Members '}' ;☕ '¥' // [4] Array : ⏳ '[' ']' ;☕ '¥' // [5] Array : ⏳ '[' Elements ']' ;☕ '¥' /*0*/states[0].Add(st.Json枝, new(LRParseAction.Kind.Goto, states[1])); /*1*/states[0].Add(st.Object枝, new(LRParseAction.Kind.Goto, states[2])); /*2*/states[0].Add(st.Array枝, new(LRParseAction.Kind.Goto, states[3])); /*3*/states[0].Add(st.@LeftBrace符, aShift4); /*4*/states[0].Add(st.@LeftBracket符, aShift5); // syntaxStates[1]: // [-1] Json' : Json ⏳ ;☕ '¥' /*5*/states[1].Add(st.@终, LRParseAction.accept); // syntaxStates[2]: // [0] Json : Object ⏳ ;☕ '¥' /*6*/states[2].Add(st.@终, new(regulations[0])); // syntaxStates[3]: // [1] Json : Array ⏳ ;☕ '¥' /*7*/states[3].Add(st.@终, new(regulations[1])); // syntaxStates[4]: // [2] Object : '{' ⏳ '}' ;☕ ',' ']' '}' '¥' // [3] Object : '{' ⏳ Members '}' ;☕ ',' ']' '}' '¥' // [6] Members : ⏳ Members ',' Member ;☕ ',' '}' // [7] Members : ⏳ Member ;☕ ',' '}' // [10] Member : ⏳ 'string' ':' Value ;☕ ',' '}' /*8*/states[4].Add(st.@RightBrace符, new(LRParseAction.Kind.Shift, states[6])); /*9*/states[4].Add(st.Members枝, new(LRParseAction.Kind.Goto, states[7])); /*10*/states[4].Add(st.Member枝, new(LRParseAction.Kind.Goto, states[8])); /*11*/states[4].Add(st.@string, aShift9); // syntaxStates[5]: // [4] Array : '[' ⏳ ']' ;☕ ',' ']' '}' '¥' // [5] Array : '[' ⏳ Elements ']' ;☕ ',' ']' '}' '¥' // [8] Elements : ⏳ Elements ',' Element ;☕ ',' ']' // [9] Elements : ⏳ Element ;☕ ',' ']' // [11] Element : ⏳ Value ;☕ ',' ']' // [12] Value : ⏳ 'null' ;☕ ',' ']' // [13] Value : ⏳ 'true' ;☕ ',' ']' // [14] Value : ⏳ 'false' ;☕ ',' ']' // [15] Value : ⏳ 'number' ;☕ ',' ']' // [16] Value : ⏳ 'string' ;☕ ',' ']' // [17] Value : ⏳ Object ;☕ ',' ']' // [18] Value : ⏳ Array ;☕ ',' ']' // [2] Object 
: ⏳ '{' '}' ;☕ ',' ']' // [3] Object : ⏳ '{' Members '}' ;☕ ',' ']' // [4] Array : ⏳ '[' ']' ;☕ ',' ']' // [5] Array : ⏳ '[' Elements ']' ;☕ ',' ']' /*12*/states[5].Add(st.@RightBracket符, new(LRParseAction.Kind.Shift, states[10])); /*13*/states[5].Add(st.Elements枝, new(LRParseAction.Kind.Goto, states[11])); /*14*/states[5].Add(st.Element枝, new(LRParseAction.Kind.Goto, states[12])); /*15*/states[5].Add(st.Value枝, aGoto13); /*16*/states[5].Add(st.@null, aShift14); /*17*/states[5].Add(st.@true, aShift15); /*18*/states[5].Add(st.@false, aShift16); /*19*/states[5].Add(st.@number, aShift17); /*20*/states[5].Add(st.@string, aShift18); /*21*/states[5].Add(st.Object枝, aGoto19); /*22*/states[5].Add(st.Array枝, aGoto20); /*23*/states[5].Add(st.@LeftBrace符, aShift4); /*24*/states[5].Add(st.@LeftBracket符, aShift5); // syntaxStates[6]: // [2] Object : '{' '}' ⏳ ;☕ ',' ']' '}' '¥' /*25*/states[6].Add(st.@Comma符, aReduce2); /*26*/states[6].Add(st.@RightBracket符, aReduce2); /*27*/states[6].Add(st.@RightBrace符, aReduce2); /*28*/states[6].Add(st.@终, aReduce2); // syntaxStates[7]: // [3] Object : '{' Members ⏳ '}' ;☕ ',' ']' '}' '¥' // [6] Members : Members ⏳ ',' Member ;☕ ',' '}' /*29*/states[7].Add(st.@RightBrace符, new(LRParseAction.Kind.Shift, states[21])); /*30*/states[7].Add(st.@Comma符, new(LRParseAction.Kind.Shift, states[22])); // syntaxStates[8]: // [7] Members : Member ⏳ ;☕ ',' '}' /*31*/states[8].Add(st.@Comma符, aReduce7); /*32*/states[8].Add(st.@RightBrace符, aReduce7); // syntaxStates[9]: // [10] Member : 'string' ⏳ ':' Value ;☕ ',' '}' /*33*/states[9].Add(st.@Colon符, new(LRParseAction.Kind.Shift, states[23])); // syntaxStates[10]: // [4] Array : '[' ']' ⏳ ;☕ ',' ']' '}' '¥' /*34*/states[10].Add(st.@Comma符, aReduce4); /*35*/states[10].Add(st.@RightBracket符, aReduce4); /*36*/states[10].Add(st.@RightBrace符, aReduce4); /*37*/states[10].Add(st.@终, aReduce4); // syntaxStates[11]: // [5] Array : '[' Elements ⏳ ']' ;☕ ',' ']' '}' '¥' // [8] Elements : Elements ⏳ ',' Element ;☕ ',' 
']' /*38*/states[11].Add(st.@RightBracket符, new(LRParseAction.Kind.Shift, states[24])); /*39*/states[11].Add(st.@Comma符, new(LRParseAction.Kind.Shift, states[25])); // syntaxStates[12]: // [9] Elements : Element ⏳ ;☕ ',' ']' /*40*/states[12].Add(st.@Comma符, aReduce9); /*41*/states[12].Add(st.@RightBracket符, aReduce9); // syntaxStates[13]: // [11] Element : Value ⏳ ;☕ ',' ']' /*42*/states[13].Add(st.@Comma符, aReduce11); /*43*/states[13].Add(st.@RightBracket符, aReduce11); // syntaxStates[14]: // [12] Value : 'null' ⏳ ;☕ ',' ']' '}' /*44*/states[14].Add(st.@Comma符, aReduce12); /*45*/states[14].Add(st.@RightBracket符, aReduce12); /*46*/states[14].Add(st.@RightBrace符, aReduce12); // syntaxStates[15]: // [13] Value : 'true' ⏳ ;☕ ',' ']' '}' /*47*/states[15].Add(st.@Comma符, aReduce13); /*48*/states[15].Add(st.@RightBracket符, aReduce13); /*49*/states[15].Add(st.@RightBrace符, aReduce13); // syntaxStates[16]: // [14] Value : 'false' ⏳ ;☕ ',' ']' '}' /*50*/states[16].Add(st.@Comma符, aReduce14); /*51*/states[16].Add(st.@RightBracket符, aReduce14); /*52*/states[16].Add(st.@RightBrace符, aReduce14); // syntaxStates[17]: // [15] Value : 'number' ⏳ ;☕ ',' ']' '}' /*53*/states[17].Add(st.@Comma符, aReduce15); /*54*/states[17].Add(st.@RightBracket符, aReduce15); /*55*/states[17].Add(st.@RightBrace符, aReduce15); // syntaxStates[18]: // [16] Value : 'string' ⏳ ;☕ ',' ']' '}' /*56*/states[18].Add(st.@Comma符, aReduce16); /*57*/states[18].Add(st.@RightBracket符, aReduce16); /*58*/states[18].Add(st.@RightBrace符, aReduce16); // syntaxStates[19]: // [17] Value : Object ⏳ ;☕ ',' ']' '}' /*59*/states[19].Add(st.@Comma符, aReduce17); /*60*/states[19].Add(st.@RightBracket符, aReduce17); /*61*/states[19].Add(st.@RightBrace符, aReduce17); // syntaxStates[20]: // [18] Value : Array ⏳ ;☕ ',' ']' '}' /*62*/states[20].Add(st.@Comma符, aReduce18); /*63*/states[20].Add(st.@RightBracket符, aReduce18); /*64*/states[20].Add(st.@RightBrace符, aReduce18); // syntaxStates[21]: // [3] Object : '{' Members '}' ⏳ ;☕ ',' 
']' '}' '¥' /*65*/states[21].Add(st.@Comma符, aReduce3); /*66*/states[21].Add(st.@RightBracket符, aReduce3); /*67*/states[21].Add(st.@RightBrace符, aReduce3); /*68*/states[21].Add(st.@终, aReduce3); // syntaxStates[22]: // [6] Members : Members ',' ⏳ Member ;☕ ',' '}' // [10] Member : ⏳ 'string' ':' Value ;☕ ',' '}' /*69*/states[22].Add(st.Member枝, new(LRParseAction.Kind.Goto, states[26])); /*70*/states[22].Add(st.@string, aShift9); // syntaxStates[23]: // [10] Member : 'string' ':' ⏳ Value ;☕ ',' '}' // [12] Value : ⏳ 'null' ;☕ ',' '}' // [13] Value : ⏳ 'true' ;☕ ',' '}' // [14] Value : ⏳ 'false' ;☕ ',' '}' // [15] Value : ⏳ 'number' ;☕ ',' '}' // [16] Value : ⏳ 'string' ;☕ ',' '}' // [17] Value : ⏳ Object ;☕ ',' '}' // [18] Value : ⏳ Array ;☕ ',' '}' // [2] Object : ⏳ '{' '}' ;☕ ',' '}' // [3] Object : ⏳ '{' Members '}' ;☕ ',' '}' // [4] Array : ⏳ '[' ']' ;☕ ',' '}' // [5] Array : ⏳ '[' Elements ']' ;☕ ',' '}' /*71*/states[23].Add(st.Value枝, new(LRParseAction.Kind.Goto, states[27])); /*72*/states[23].Add(st.@null, aShift14); /*73*/states[23].Add(st.@true, aShift15); /*74*/states[23].Add(st.@false, aShift16); /*75*/states[23].Add(st.@number, aShift17); /*76*/states[23].Add(st.@string, aShift18); /*77*/states[23].Add(st.Object枝, aGoto19); /*78*/states[23].Add(st.Array枝, aGoto20); /*79*/states[23].Add(st.@LeftBrace符, aShift4); /*80*/states[23].Add(st.@LeftBracket符, aShift5); // syntaxStates[24]: // [5] Array : '[' Elements ']' ⏳ ;☕ ',' ']' '}' '¥' /*81*/states[24].Add(st.@Comma符, aReduce5); /*82*/states[24].Add(st.@RightBracket符, aReduce5); /*83*/states[24].Add(st.@RightBrace符, aReduce5); /*84*/states[24].Add(st.@终, aReduce5); // syntaxStates[25]: // [8] Elements : Elements ',' ⏳ Element ;☕ ',' ']' // [11] Element : ⏳ Value ;☕ ',' ']' // [12] Value : ⏳ 'null' ;☕ ',' ']' // [13] Value : ⏳ 'true' ;☕ ',' ']' // [14] Value : ⏳ 'false' ;☕ ',' ']' // [15] Value : ⏳ 'number' ;☕ ',' ']' // [16] Value : ⏳ 'string' ;☕ ',' ']' // [17] Value : ⏳ Object ;☕ ',' ']' // [18] Value : ⏳ 
Array ;☕ ',' ']' // [2] Object : ⏳ '{' '}' ;☕ ',' ']' // [3] Object : ⏳ '{' Members '}' ;☕ ',' ']' // [4] Array : ⏳ '[' ']' ;☕ ',' ']' // [5] Array : ⏳ '[' Elements ']' ;☕ ',' ']' /*85*/states[25].Add(st.Element枝, new(LRParseAction.Kind.Goto, states[28])); /*86*/states[25].Add(st.Value枝, aGoto13); /*87*/states[25].Add(st.@null, aShift14); /*88*/states[25].Add(st.@true, aShift15); /*89*/states[25].Add(st.@false, aShift16); /*90*/states[25].Add(st.@number, aShift17); /*91*/states[25].Add(st.@string, aShift18); /*92*/states[25].Add(st.Object枝, aGoto19); /*93*/states[25].Add(st.Array枝, aGoto20); /*94*/states[25].Add(st.@LeftBrace符, aShift4); /*95*/states[25].Add(st.@LeftBracket符, aShift5); // syntaxStates[26]: // [6] Members : Members ',' Member ⏳ ;☕ ',' '}' /*96*/states[26].Add(st.@Comma符, aReduce6); /*97*/states[26].Add(st.@RightBrace符, aReduce6); // syntaxStates[27]: // [10] Member : 'string' ':' Value ⏳ ;☕ ',' '}' /*98*/states[27].Add(st.@Comma符, aReduce10); /*99*/states[27].Add(st.@RightBrace符, aReduce10); // syntaxStates[28]: // [8] Elements : Elements ',' Element ⏳ ;☕ ',' ']' /*100*/states[28].Add(st.@Comma符, aReduce8); /*101*/states[28].Add(st.@RightBracket符, aReduce8); #endregion init actions of syntax states
return states; } }}


The other 3 Json.Dict.*.gen.cs_ files are the syntax state machines for LR(0), SLR(1) and LR(1); they need no further discussion.

This was the first and most intuitive implementation, and it has since been superseded by more efficient ones. The folder is now kept for study and reference only, which is why I changed the C# file extension from cs to cs_ so that the files are not compiled.
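With every syntax state stored as a Dictionary<int, LRParseAction>, the parser driver reduces to "look up the next symbol in the current state's dictionary and shift, reduce, go to, or accept". The following is a minimal sketch of such a table-driven loop for a toy grammar S → a S b | c, with the tables built by hand (these are NOT the generated Json tables; the point is the driver shape):

```csharp
using System;
using System.Collections.Generic;

// Toy table-driven LR parser for  S -> a S b | c .
// kind: 's' = shift to state, 'r' = reduce by rule, 'A' = accept.
var action = new Dictionary<(int state, char sym), (char kind, int arg)>
{
    [(0, 'a')] = ('s', 2), [(0, 'c')] = ('s', 3),
    [(2, 'a')] = ('s', 2), [(2, 'c')] = ('s', 3),
    [(1, '$')] = ('A', 0),
    [(3, 'b')] = ('r', 2), [(3, '$')] = ('r', 2),
    [(4, 'b')] = ('s', 5),
    [(5, 'b')] = ('r', 1), [(5, '$')] = ('r', 1),
};
var gotos = new Dictionary<(int state, char nt), int> { [(0, 'S')] = 1, [(2, 'S')] = 4 };
var ruleLen = new Dictionary<int, int> { [1] = 3, [2] = 1 };   // RHS lengths
var ruleLhs = new Dictionary<int, char> { [1] = 'S', [2] = 'S' };

bool Parse(string input)
{
    var stack = new Stack<int>();
    stack.Push(0);
    var s = input + "$"; // end-of-input marker, like '¥' in the dump
    var i = 0;
    while (true)
    {
        if (!action.TryGetValue((stack.Peek(), s[i]), out var act))
            return false; // no entry = syntax error
        switch (act.kind)
        {
            case 's': stack.Push(act.arg); i++; break;             // shift
            case 'r':                                              // reduce
                for (var k = 0; k < ruleLen[act.arg]; k++) stack.Pop();
                stack.Push(gotos[(stack.Peek(), ruleLhs[act.arg])]);
                break;
            default: return true;                                  // accept
        }
    }
}

Console.WriteLine(Parse("aacbb")); // True
Console.WriteLine(Parse("ab"));    // False
```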


int[]+LRParseAction[]


Json.Table.LALR(1).gen.cs_ is the LALR(1) syntax state machine, in which every syntax state is an object holding an int[] and an LRParseAction[]. Each pair int[t] / LRParseAction[t] stands in for one key/value pair of the Dictionary<int, LRParseAction> object, which reduces memory usage and slightly improves runtime efficiency.


Json.Table.LALR(1).gen.cs_

using System;using bitzhuwei.Compiler;
namespace bitzhuwei.JsonFormat { partial class CompilerJson {
private static LRParseState[] InitializeSyntaxStates() { const int syntaxStateCount = 29; var states = new LRParseState[syntaxStateCount]; // 102 actions // conflicts(0)=not sovled(0)+solved(0)(0 warnings) for (var i = 0; i < syntaxStateCount; i++) { states[i] = new(); }
#region re-used actions LRParseAction aShift4 = new(LRParseAction.Kind.Shift, states[4]);// refered 4 times LRParseAction aShift5 = new(LRParseAction.Kind.Shift, states[5]);// refered 4 times LRParseAction aShift9 = new(LRParseAction.Kind.Shift, states[9]);// refered 2 times LRParseAction aGoto13 = new(LRParseAction.Kind.Goto, states[13]);// refered 2 times LRParseAction aShift14 = new(LRParseAction.Kind.Shift, states[14]);// refered 3 times LRParseAction aShift15 = new(LRParseAction.Kind.Shift, states[15]);// refered 3 times LRParseAction aShift16 = new(LRParseAction.Kind.Shift, states[16]);// refered 3 times LRParseAction aShift17 = new(LRParseAction.Kind.Shift, states[17]);// refered 3 times LRParseAction aShift18 = new(LRParseAction.Kind.Shift, states[18]);// refered 3 times LRParseAction aGoto19 = new(LRParseAction.Kind.Goto, states[19]);// refered 3 times LRParseAction aGoto20 = new(LRParseAction.Kind.Goto, states[20]);// refered 3 times LRParseAction aReduce2 = new(regulations[2]);// refered 4 times LRParseAction aReduce7 = new(regulations[7]);// refered 2 times LRParseAction aReduce4 = new(regulations[4]);// refered 4 times LRParseAction aReduce9 = new(regulations[9]);// refered 2 times LRParseAction aReduce11 = new(regulations[11]);// refered 2 times LRParseAction aReduce12 = new(regulations[12]);// refered 3 times LRParseAction aReduce13 = new(regulations[13]);// refered 3 times LRParseAction aReduce14 = new(regulations[14]);// refered 3 times LRParseAction aReduce15 = new(regulations[15]);// refered 3 times LRParseAction aReduce16 = new(regulations[16]);// refered 3 times LRParseAction aReduce17 = new(regulations[17]);// refered 3 times LRParseAction aReduce18 = new(regulations[18]);// refered 3 times LRParseAction aReduce3 = new(regulations[3]);// refered 4 times LRParseAction aReduce5 = new(regulations[5]);// refered 4 times LRParseAction aReduce6 = new(regulations[6]);// refered 2 times LRParseAction aReduce10 = new(regulations[10]);// refered 2 times 
LRParseAction aReduce8 = new(regulations[8]);// refered 2 times #endregion re-used actions
// 102 actions // conflicts(0)=not sovled(0)+solved(0)(0 warnings) #region init actions of syntax states // syntaxStates[0]: // [-1] Json' : ⏳ Json ;☕ '¥' // [0] Json : ⏳ Object ;☕ '¥' // [1] Json : ⏳ Array ;☕ '¥' // [2] Object : ⏳ '{' '}' ;☕ '¥' // [3] Object : ⏳ '{' Members '}' ;☕ '¥' // [4] Array : ⏳ '[' ']' ;☕ '¥' // [5] Array : ⏳ '[' Elements ']' ;☕ '¥' states[0].nodes = new int[] { /*0*/st.@LeftBrace符, // (1) -> aShift4 /*1*/st.@LeftBracket符, // (3) -> aShift5 /*2*/st.Json枝, // (12) -> new(LRParseAction.Kind.Goto, states[1]) /*3*/st.Object枝, // (13) -> new(LRParseAction.Kind.Goto, states[2]) /*4*/st.Array枝, // (14) -> new(LRParseAction.Kind.Goto, states[3]) }; states[0].actions = new LRParseAction[] { /*0*//* st.@LeftBrace符(1), */aShift4, /*1*//* st.@LeftBracket符(3), */aShift5, /*2*//* st.Json枝(12), */new(LRParseAction.Kind.Goto, states[1]), /*3*//* st.Object枝(13), */new(LRParseAction.Kind.Goto, states[2]), /*4*//* st.Array枝(14), */new(LRParseAction.Kind.Goto, states[3]), }; // syntaxStates[1]: // [-1] Json' : Json ⏳ ;☕ '¥' states[1].nodes = new int[] { /*5*/st.@终, // (0) -> LRParseAction.accept }; states[1].actions = new LRParseAction[] { /*5*//* st.@终(0), */LRParseAction.accept, }; // syntaxStates[2]: // [0] Json : Object ⏳ ;☕ '¥' states[2].nodes = new int[] { /*6*/st.@终, // (0) -> new(regulations[0]) }; states[2].actions = new LRParseAction[] { /*6*//* st.@终(0), */new(regulations[0]), }; // syntaxStates[3]: // [1] Json : Array ⏳ ;☕ '¥' states[3].nodes = new int[] { /*7*/st.@终, // (0) -> new(regulations[1]) }; states[3].actions = new LRParseAction[] { /*7*//* st.@终(0), */new(regulations[1]), }; // syntaxStates[4]: // [2] Object : '{' ⏳ '}' ;☕ ',' ']' '}' '¥' // [3] Object : '{' ⏳ Members '}' ;☕ ',' ']' '}' '¥' // [6] Members : ⏳ Members ',' Member ;☕ ',' '}' // [7] Members : ⏳ Member ;☕ ',' '}' // [10] Member : ⏳ 'string' ':' Value ;☕ ',' '}' states[4].nodes = new int[] { /*8*/st.@RightBrace符, // (2) -> new(LRParseAction.Kind.Shift, states[6]) 
/*9*/st.@string, // (6) -> aShift9 /*10*/st.Members枝, // (15) -> new(LRParseAction.Kind.Goto, states[7]) /*11*/st.Member枝, // (17) -> new(LRParseAction.Kind.Goto, states[8]) }; states[4].actions = new LRParseAction[] { /*8*//* st.@RightBrace符(2), */new(LRParseAction.Kind.Shift, states[6]), /*9*//* st.@string(6), */aShift9, /*10*//* st.Members枝(15), */new(LRParseAction.Kind.Goto, states[7]), /*11*//* st.Member枝(17), */new(LRParseAction.Kind.Goto, states[8]), }; // syntaxStates[5]: // [4] Array : '[' ⏳ ']' ;☕ ',' ']' '}' '¥' // [5] Array : '[' ⏳ Elements ']' ;☕ ',' ']' '}' '¥' // [8] Elements : ⏳ Elements ',' Element ;☕ ',' ']' // [9] Elements : ⏳ Element ;☕ ',' ']' // [11] Element : ⏳ Value ;☕ ',' ']' // [12] Value : ⏳ 'null' ;☕ ',' ']' // [13] Value : ⏳ 'true' ;☕ ',' ']' // [14] Value : ⏳ 'false' ;☕ ',' ']' // [15] Value : ⏳ 'number' ;☕ ',' ']' // [16] Value : ⏳ 'string' ;☕ ',' ']' // [17] Value : ⏳ Object ;☕ ',' ']' // [18] Value : ⏳ Array ;☕ ',' ']' // [2] Object : ⏳ '{' '}' ;☕ ',' ']' // [3] Object : ⏳ '{' Members '}' ;☕ ',' ']' // [4] Array : ⏳ '[' ']' ;☕ ',' ']' // [5] Array : ⏳ '[' Elements ']' ;☕ ',' ']' states[5].nodes = new int[] { /*12*/st.@LeftBrace符, // (1) -> aShift4 /*13*/st.@LeftBracket符, // (3) -> aShift5 /*14*/st.@RightBracket符, // (4) -> new(LRParseAction.Kind.Shift, states[10]) /*15*/st.@string, // (6) -> aShift18 /*16*/st.@null, // (8) -> aShift14 /*17*/st.@true, // (9) -> aShift15 /*18*/st.@false, // (10) -> aShift16 /*19*/st.@number, // (11) -> aShift17 /*20*/st.Object枝, // (13) -> aGoto19 /*21*/st.Array枝, // (14) -> aGoto20 /*22*/st.Elements枝, // (16) -> new(LRParseAction.Kind.Goto, states[11]) /*23*/st.Element枝, // (18) -> new(LRParseAction.Kind.Goto, states[12]) /*24*/st.Value枝, // (19) -> aGoto13 }; states[5].actions = new LRParseAction[] { /*12*//* st.@LeftBrace符(1), */aShift4, /*13*//* st.@LeftBracket符(3), */aShift5, /*14*//* st.@RightBracket符(4), */new(LRParseAction.Kind.Shift, states[10]), /*15*//* st.@string(6), */aShift18, /*16*//* 
st.@null(8), */aShift14, /*17*//* st.@true(9), */aShift15, /*18*//* st.@false(10), */aShift16, /*19*//* st.@number(11), */aShift17, /*20*//* st.Object枝(13), */aGoto19, /*21*//* st.Array枝(14), */aGoto20, /*22*//* st.Elements枝(16), */new(LRParseAction.Kind.Goto, states[11]), /*23*//* st.Element枝(18), */new(LRParseAction.Kind.Goto, states[12]), /*24*//* st.Value枝(19), */aGoto13, }; // syntaxStates[6]: // [2] Object : '{' '}' ⏳ ;☕ ',' ']' '}' '¥' states[6].nodes = new int[] { /*25*/st.@终, // (0) -> aReduce2 /*26*/st.@RightBrace符, // (2) -> aReduce2 /*27*/st.@RightBracket符, // (4) -> aReduce2 /*28*/st.@Comma符, // (5) -> aReduce2 }; states[6].actions = new LRParseAction[] { /*25*//* st.@终(0), */aReduce2, /*26*//* st.@RightBrace符(2), */aReduce2, /*27*//* st.@RightBracket符(4), */aReduce2, /*28*//* st.@Comma符(5), */aReduce2, }; // syntaxStates[7]: // [3] Object : '{' Members ⏳ '}' ;☕ ',' ']' '}' '¥' // [6] Members : Members ⏳ ',' Member ;☕ ',' '}' states[7].nodes = new int[] { /*29*/st.@RightBrace符, // (2) -> new(LRParseAction.Kind.Shift, states[21]) /*30*/st.@Comma符, // (5) -> new(LRParseAction.Kind.Shift, states[22]) }; states[7].actions = new LRParseAction[] { /*29*//* st.@RightBrace符(2), */new(LRParseAction.Kind.Shift, states[21]), /*30*//* st.@Comma符(5), */new(LRParseAction.Kind.Shift, states[22]), }; // syntaxStates[8]: // [7] Members : Member ⏳ ;☕ ',' '}' states[8].nodes = new int[] { /*31*/st.@RightBrace符, // (2) -> aReduce7 /*32*/st.@Comma符, // (5) -> aReduce7 }; states[8].actions = new LRParseAction[] { /*31*//* st.@RightBrace符(2), */aReduce7, /*32*//* st.@Comma符(5), */aReduce7, }; // syntaxStates[9]: // [10] Member : 'string' ⏳ ':' Value ;☕ ',' '}' states[9].nodes = new int[] { /*33*/st.@Colon符, // (7) -> new(LRParseAction.Kind.Shift, states[23]) }; states[9].actions = new LRParseAction[] { /*33*//* st.@Colon符(7), */new(LRParseAction.Kind.Shift, states[23]), }; // syntaxStates[10]: // [4] Array : '[' ']' ⏳ ;☕ ',' ']' '}' '¥' states[10].nodes = new int[] { 
/*34*/st.@终, // (0) -> aReduce4 /*35*/st.@RightBrace符, // (2) -> aReduce4 /*36*/st.@RightBracket符, // (4) -> aReduce4 /*37*/st.@Comma符, // (5) -> aReduce4 }; states[10].actions = new LRParseAction[] { /*34*//* st.@终(0), */aReduce4, /*35*//* st.@RightBrace符(2), */aReduce4, /*36*//* st.@RightBracket符(4), */aReduce4, /*37*//* st.@Comma符(5), */aReduce4, }; // syntaxStates[11]: // [5] Array : '[' Elements ⏳ ']' ;☕ ',' ']' '}' '¥' // [8] Elements : Elements ⏳ ',' Element ;☕ ',' ']' states[11].nodes = new int[] { /*38*/st.@RightBracket符, // (4) -> new(LRParseAction.Kind.Shift, states[24]) /*39*/st.@Comma符, // (5) -> new(LRParseAction.Kind.Shift, states[25]) }; states[11].actions = new LRParseAction[] { /*38*//* st.@RightBracket符(4), */new(LRParseAction.Kind.Shift, states[24]), /*39*//* st.@Comma符(5), */new(LRParseAction.Kind.Shift, states[25]), }; // syntaxStates[12]: // [9] Elements : Element ⏳ ;☕ ',' ']' states[12].nodes = new int[] { /*40*/st.@RightBracket符, // (4) -> aReduce9 /*41*/st.@Comma符, // (5) -> aReduce9 }; states[12].actions = new LRParseAction[] { /*40*//* st.@RightBracket符(4), */aReduce9, /*41*//* st.@Comma符(5), */aReduce9, }; // syntaxStates[13]: // [11] Element : Value ⏳ ;☕ ',' ']' states[13].nodes = new int[] { /*42*/st.@RightBracket符, // (4) -> aReduce11 /*43*/st.@Comma符, // (5) -> aReduce11 }; states[13].actions = new LRParseAction[] { /*42*//* st.@RightBracket符(4), */aReduce11, /*43*//* st.@Comma符(5), */aReduce11, }; // syntaxStates[14]: // [12] Value : 'null' ⏳ ;☕ ',' ']' '}' states[14].nodes = new int[] { /*44*/st.@RightBrace符, // (2) -> aReduce12 /*45*/st.@RightBracket符, // (4) -> aReduce12 /*46*/st.@Comma符, // (5) -> aReduce12 }; states[14].actions = new LRParseAction[] { /*44*//* st.@RightBrace符(2), */aReduce12, /*45*//* st.@RightBracket符(4), */aReduce12, /*46*//* st.@Comma符(5), */aReduce12, }; // syntaxStates[15]: // [13] Value : 'true' ⏳ ;☕ ',' ']' '}' states[15].nodes = new int[] { /*47*/st.@RightBrace符, // (2) -> aReduce13 
/*48*/st.@RightBracket符, // (4) -> aReduce13 /*49*/st.@Comma符, // (5) -> aReduce13 }; states[15].actions = new LRParseAction[] { /*47*//* st.@RightBrace符(2), */aReduce13, /*48*//* st.@RightBracket符(4), */aReduce13, /*49*//* st.@Comma符(5), */aReduce13, }; // syntaxStates[16]: // [14] Value : 'false' ⏳ ;☕ ',' ']' '}' states[16].nodes = new int[] { /*50*/st.@RightBrace符, // (2) -> aReduce14 /*51*/st.@RightBracket符, // (4) -> aReduce14 /*52*/st.@Comma符, // (5) -> aReduce14 }; states[16].actions = new LRParseAction[] { /*50*//* st.@RightBrace符(2), */aReduce14, /*51*//* st.@RightBracket符(4), */aReduce14, /*52*//* st.@Comma符(5), */aReduce14, }; // syntaxStates[17]: // [15] Value : 'number' ⏳ ;☕ ',' ']' '}' states[17].nodes = new int[] { /*53*/st.@RightBrace符, // (2) -> aReduce15 /*54*/st.@RightBracket符, // (4) -> aReduce15 /*55*/st.@Comma符, // (5) -> aReduce15 }; states[17].actions = new LRParseAction[] { /*53*//* st.@RightBrace符(2), */aReduce15, /*54*//* st.@RightBracket符(4), */aReduce15, /*55*//* st.@Comma符(5), */aReduce15, }; // syntaxStates[18]: // [16] Value : 'string' ⏳ ;☕ ',' ']' '}' states[18].nodes = new int[] { /*56*/st.@RightBrace符, // (2) -> aReduce16 /*57*/st.@RightBracket符, // (4) -> aReduce16 /*58*/st.@Comma符, // (5) -> aReduce16 }; states[18].actions = new LRParseAction[] { /*56*//* st.@RightBrace符(2), */aReduce16, /*57*//* st.@RightBracket符(4), */aReduce16, /*58*//* st.@Comma符(5), */aReduce16, }; // syntaxStates[19]: // [17] Value : Object ⏳ ;☕ ',' ']' '}' states[19].nodes = new int[] { /*59*/st.@RightBrace符, // (2) -> aReduce17 /*60*/st.@RightBracket符, // (4) -> aReduce17 /*61*/st.@Comma符, // (5) -> aReduce17 }; states[19].actions = new LRParseAction[] { /*59*//* st.@RightBrace符(2), */aReduce17, /*60*//* st.@RightBracket符(4), */aReduce17, /*61*//* st.@Comma符(5), */aReduce17, }; // syntaxStates[20]: // [18] Value : Array ⏳ ;☕ ',' ']' '}' states[20].nodes = new int[] { /*62*/st.@RightBrace符, // (2) -> aReduce18 /*63*/st.@RightBracket符, // (4) -> aReduce18 
/*64*/st.@Comma符, // (5) -> aReduce18 }; states[20].actions = new LRParseAction[] { /*62*//* st.@RightBrace符(2), */aReduce18, /*63*//* st.@RightBracket符(4), */aReduce18, /*64*//* st.@Comma符(5), */aReduce18, }; // syntaxStates[21]: // [3] Object : '{' Members '}' ⏳ ;☕ ',' ']' '}' '¥' states[21].nodes = new int[] { /*65*/st.@终, // (0) -> aReduce3 /*66*/st.@RightBrace符, // (2) -> aReduce3 /*67*/st.@RightBracket符, // (4) -> aReduce3 /*68*/st.@Comma符, // (5) -> aReduce3 }; states[21].actions = new LRParseAction[] { /*65*//* st.@终(0), */aReduce3, /*66*//* st.@RightBrace符(2), */aReduce3, /*67*//* st.@RightBracket符(4), */aReduce3, /*68*//* st.@Comma符(5), */aReduce3, }; // syntaxStates[22]: // [6] Members : Members ',' ⏳ Member ;☕ ',' '}' // [10] Member : ⏳ 'string' ':' Value ;☕ ',' '}' states[22].nodes = new int[] { /*69*/st.@string, // (6) -> aShift9 /*70*/st.Member枝, // (17) -> new(LRParseAction.Kind.Goto, states[26]) }; states[22].actions = new LRParseAction[] { /*69*//* st.@string(6), */aShift9, /*70*//* st.Member枝(17), */new(LRParseAction.Kind.Goto, states[26]), }; // syntaxStates[23]: // [10] Member : 'string' ':' ⏳ Value ;☕ ',' '}' // [12] Value : ⏳ 'null' ;☕ ',' '}' // [13] Value : ⏳ 'true' ;☕ ',' '}' // [14] Value : ⏳ 'false' ;☕ ',' '}' // [15] Value : ⏳ 'number' ;☕ ',' '}' // [16] Value : ⏳ 'string' ;☕ ',' '}' // [17] Value : ⏳ Object ;☕ ',' '}' // [18] Value : ⏳ Array ;☕ ',' '}' // [2] Object : ⏳ '{' '}' ;☕ ',' '}' // [3] Object : ⏳ '{' Members '}' ;☕ ',' '}' // [4] Array : ⏳ '[' ']' ;☕ ',' '}' // [5] Array : ⏳ '[' Elements ']' ;☕ ',' '}' states[23].nodes = new int[] { /*71*/st.@LeftBrace符, // (1) -> aShift4 /*72*/st.@LeftBracket符, // (3) -> aShift5 /*73*/st.@string, // (6) -> aShift18 /*74*/st.@null, // (8) -> aShift14 /*75*/st.@true, // (9) -> aShift15 /*76*/st.@false, // (10) -> aShift16 /*77*/st.@number, // (11) -> aShift17 /*78*/st.Object枝, // (13) -> aGoto19 /*79*/st.Array枝, // (14) -> aGoto20 /*80*/st.Value枝, // (19) -> new(LRParseAction.Kind.Goto, 
states[27]) }; states[23].actions = new LRParseAction[] { /*71*//* st.@LeftBrace符(1), */aShift4, /*72*//* st.@LeftBracket符(3), */aShift5, /*73*//* st.@string(6), */aShift18, /*74*//* st.@null(8), */aShift14, /*75*//* st.@true(9), */aShift15, /*76*//* st.@false(10), */aShift16, /*77*//* st.@number(11), */aShift17, /*78*//* st.Object枝(13), */aGoto19, /*79*//* st.Array枝(14), */aGoto20, /*80*//* st.Value枝(19), */new(LRParseAction.Kind.Goto, states[27]), }; // syntaxStates[24]: // [5] Array : '[' Elements ']' ⏳ ;☕ ',' ']' '}' '¥' states[24].nodes = new int[] { /*81*/st.@终, // (0) -> aReduce5 /*82*/st.@RightBrace符, // (2) -> aReduce5 /*83*/st.@RightBracket符, // (4) -> aReduce5 /*84*/st.@Comma符, // (5) -> aReduce5 }; states[24].actions = new LRParseAction[] { /*81*//* st.@终(0), */aReduce5, /*82*//* st.@RightBrace符(2), */aReduce5, /*83*//* st.@RightBracket符(4), */aReduce5, /*84*//* st.@Comma符(5), */aReduce5, }; // syntaxStates[25]: // [8] Elements : Elements ',' ⏳ Element ;☕ ',' ']' // [11] Element : ⏳ Value ;☕ ',' ']' // [12] Value : ⏳ 'null' ;☕ ',' ']' // [13] Value : ⏳ 'true' ;☕ ',' ']' // [14] Value : ⏳ 'false' ;☕ ',' ']' // [15] Value : ⏳ 'number' ;☕ ',' ']' // [16] Value : ⏳ 'string' ;☕ ',' ']' // [17] Value : ⏳ Object ;☕ ',' ']' // [18] Value : ⏳ Array ;☕ ',' ']' // [2] Object : ⏳ '{' '}' ;☕ ',' ']' // [3] Object : ⏳ '{' Members '}' ;☕ ',' ']' // [4] Array : ⏳ '[' ']' ;☕ ',' ']' // [5] Array : ⏳ '[' Elements ']' ;☕ ',' ']' states[25].nodes = new int[] { /*85*/st.@LeftBrace符, // (1) -> aShift4 /*86*/st.@LeftBracket符, // (3) -> aShift5 /*87*/st.@string, // (6) -> aShift18 /*88*/st.@null, // (8) -> aShift14 /*89*/st.@true, // (9) -> aShift15 /*90*/st.@false, // (10) -> aShift16 /*91*/st.@number, // (11) -> aShift17 /*92*/st.Object枝, // (13) -> aGoto19 /*93*/st.Array枝, // (14) -> aGoto20 /*94*/st.Element枝, // (18) -> new(LRParseAction.Kind.Goto, states[28]) /*95*/st.Value枝, // (19) -> aGoto13 }; states[25].actions = new LRParseAction[] { /*85*//* st.@LeftBrace符(1), 
*/aShift4, /*86*//* st.@LeftBracket符(3), */aShift5, /*87*//* st.@string(6), */aShift18, /*88*//* st.@null(8), */aShift14, /*89*//* st.@true(9), */aShift15, /*90*//* st.@false(10), */aShift16, /*91*//* st.@number(11), */aShift17, /*92*//* st.Object枝(13), */aGoto19, /*93*//* st.Array枝(14), */aGoto20, /*94*//* st.Element枝(18), */new(LRParseAction.Kind.Goto, states[28]), /*95*//* st.Value枝(19), */aGoto13, }; // syntaxStates[26]: // [6] Members : Members ',' Member ⏳ ;☕ ',' '}' states[26].nodes = new int[] { /*96*/st.@RightBrace符, // (2) -> aReduce6 /*97*/st.@Comma符, // (5) -> aReduce6 }; states[26].actions = new LRParseAction[] { /*96*//* st.@RightBrace符(2), */aReduce6, /*97*//* st.@Comma符(5), */aReduce6, }; // syntaxStates[27]: // [10] Member : 'string' ':' Value ⏳ ;☕ ',' '}' states[27].nodes = new int[] { /*98*/st.@RightBrace符, // (2) -> aReduce10 /*99*/st.@Comma符, // (5) -> aReduce10 }; states[27].actions = new LRParseAction[] { /*98*//* st.@RightBrace符(2), */aReduce10, /*99*//* st.@Comma符(5), */aReduce10, }; // syntaxStates[28]: // [8] Elements : Elements ',' Element ⏳ ;☕ ',' ']' states[28].nodes = new int[] { /*100*/st.@RightBracket符, // (4) -> aReduce8 /*101*/st.@Comma符, // (5) -> aReduce8 }; states[28].actions = new LRParseAction[] { /*100*//* st.@RightBracket符(4), */aReduce8, /*101*//* st.@Comma符(5), */aReduce8, }; #endregion init actions of syntax states
return states; } }}


The other four Json.Dict.*.gen.cs_ files are the syntax-parsing state machines for LL(1), LR(0), SLR(1), and LR(1) respectively, so they are not repeated here.

This was the second implementation, and it has since been superseded by a more efficient approach. The folder is now kept for study and reference only, which is why I changed the C# files' extension from cs to cs_ so that they are not compiled.


Json.Table.*.gen.bin


As with the lexer, this writes the array-form (int[] + LRParseAction[]) parse table into a binary file. When the Json parser is loaded, reading this file yields the array-form (int[] + LRParseAction[]) parse table. The whole table therefore no longer needs to be hard-coded into the source, which further reduces memory usage.
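As a loose illustration of the idea (the actual binary layout of Json.Table.*.gen.bin is not documented in this article, so the file format below is an assumption, not bitParser's real format), an int[]-based table can be round-tripped through a binary file like this:

```csharp
// Minimal sketch: persist an int[] table to a binary file and load it
// back at startup, instead of hard-coding it in source. The layout
// (length prefix + raw ints) is an assumption for illustration only.
using System;
using System.IO;

int[] nodes = { 1, 3, 12, 13, 14 }; // e.g. states[0].nodes of the LALR(1) table

using (var w = new BinaryWriter(File.Create("table.bin"))) {
    w.Write(nodes.Length);           // length prefix
    foreach (var n in nodes) w.Write(n);
}

int[] loaded;
using (var r = new BinaryReader(File.OpenRead("table.bin"))) {
    loaded = new int[r.ReadInt32()];
    for (int i = 0; i < loaded.Length; i++) loaded[i] = r.ReadInt32();
}

Console.WriteLine(string.Join(" ", loaded)); // 1 3 12 13 14
```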


For easier debugging and reference, I also generate a corresponding text form, e.g. the LALR(1) parse table:


Json.Table.LALR(1).gen.txt


conflicts(0)=not sovled(0)+solved(0)(0 warnings)
29 states.
28 re-used actions
[0]:Shift[4] [1]:Shift[5] [2]:Shift[9] [3]:Goto[13] [4]:Shift[14] [5]:Shift[15] [6]:Shift[16] [7]:Shift[17] [8]:Shift[18] [9]:Goto[19] [10]:Goto[20] [11]:Reduce[2] [12]:Reduce[7] [13]:Reduce[4] [14]:Reduce[9] [15]:Reduce[11] [16]:Reduce[12] [17]:Reduce[13] [18]:Reduce[14] [19]:Reduce[15] [20]:Reduce[16] [21]:Reduce[17] [22]:Reduce[18] [23]:Reduce[3] [24]:Reduce[5] [25]:Reduce[6] [26]:Reduce[10] [27]:Reduce[8]
states[0].nodes[5]:1 3 12 13 14
states[0].actions[5]:-4(0)Shift[4] -4(1)Shift[5] Goto[1] Goto[2] Goto[3]
states[1].nodes[1]:0
states[1].actions[1]:Accept[0]
states[2].nodes[1]:0
states[2].actions[1]:Reduce[0]
states[3].nodes[1]:0
states[3].actions[1]:Reduce[1]
states[4].nodes[4]:2 6 15 17
states[4].actions[4]:Shift[6] -2(2)Shift[9] Goto[7] Goto[8]
states[5].nodes[13]:1 3 4 6 8 9 10 11 13 14 16 18 19
states[5].actions[13]:-4(0)Shift[4] -4(1)Shift[5] Shift[10] -3(8)Shift[18] -3(4)Shift[14] -3(5)Shift[15] -3(6)Shift[16] -3(7)Shift[17] -3(9)Goto[19] -3(10)Goto[20] Goto[11] Goto[12] -2(3)Goto[13]
states[6].nodes[4]:0 2 4 5
states[6].actions[4]:-4(11)Reduce[2] -4(11)Reduce[2] -4(11)Reduce[2] -4(11)Reduce[2]
states[7].nodes[2]:2 5
states[7].actions[2]:Shift[21] Shift[22]
states[8].nodes[2]:2 5
states[8].actions[2]:-2(12)Reduce[7] -2(12)Reduce[7]
states[9].nodes[1]:7
states[9].actions[1]:Shift[23]
states[10].nodes[4]:0 2 4 5
states[10].actions[4]:-4(13)Reduce[4] -4(13)Reduce[4] -4(13)Reduce[4] -4(13)Reduce[4]
states[11].nodes[2]:4 5
states[11].actions[2]:Shift[24] Shift[25]
states[12].nodes[2]:4 5
states[12].actions[2]:-2(14)Reduce[9] -2(14)Reduce[9]
states[13].nodes[2]:4 5
states[13].actions[2]:-2(15)Reduce[11] -2(15)Reduce[11]
states[14].nodes[3]:2 4 5
states[14].actions[3]:-3(16)Reduce[12] -3(16)Reduce[12] -3(16)Reduce[12]
states[15].nodes[3]:2 4 5
states[15].actions[3]:-3(17)Reduce[13] -3(17)Reduce[13] -3(17)Reduce[13]
states[16].nodes[3]:2 4 5
states[16].actions[3]:-3(18)Reduce[14] -3(18)Reduce[14] -3(18)Reduce[14]
states[17].nodes[3]:2 4 5
states[17].actions[3]:-3(19)Reduce[15] -3(19)Reduce[15] -3(19)Reduce[15]
states[18].nodes[3]:2 4 5
states[18].actions[3]:-3(20)Reduce[16] -3(20)Reduce[16] -3(20)Reduce[16]
states[19].nodes[3]:2 4 5
states[19].actions[3]:-3(21)Reduce[17] -3(21)Reduce[17] -3(21)Reduce[17]
states[20].nodes[3]:2 4 5
states[20].actions[3]:-3(22)Reduce[18] -3(22)Reduce[18] -3(22)Reduce[18]
states[21].nodes[4]:0 2 4 5
states[21].actions[4]:-4(23)Reduce[3] -4(23)Reduce[3] -4(23)Reduce[3] -4(23)Reduce[3]
states[22].nodes[2]:6 17
states[22].actions[2]:-2(2)Shift[9] Goto[26]
states[23].nodes[10]:1 3 6 8 9 10 11 13 14 19
states[23].actions[10]:-4(0)Shift[4] -4(1)Shift[5] -3(8)Shift[18] -3(4)Shift[14] -3(5)Shift[15] -3(6)Shift[16] -3(7)Shift[17] -3(9)Goto[19] -3(10)Goto[20] Goto[27]
states[24].nodes[4]:0 2 4 5
states[24].actions[4]:-4(24)Reduce[5] -4(24)Reduce[5] -4(24)Reduce[5] -4(24)Reduce[5]
states[25].nodes[11]:1 3 6 8 9 10 11 13 14 18 19
states[25].actions[11]:-4(0)Shift[4] -4(1)Shift[5] -3(8)Shift[18] -3(4)Shift[14] -3(5)Shift[15] -3(6)Shift[16] -3(7)Shift[17] -3(9)Goto[19] -3(10)Goto[20] Goto[28] -2(3)Goto[13]
states[26].nodes[2]:2 5
states[26].actions[2]:-2(25)Reduce[6] -2(25)Reduce[6]
states[27].nodes[2]:2 5
states[27].actions[2]:-2(26)Reduce[10] -2(26)Reduce[10]
states[28].nodes[2]:4 5
states[28].actions[2]:-2(27)Reduce[8] -2(27)Reduce[8]


This is the third implementation and the one currently in use. To simplify the load path, I moved it from the Json.gen\SyntaxParser folder up to the Json.gen folder.


Json.Regulations.gen.cs_


This is an array that records all the production rules of the Json grammar:


Json.Regulations.gen.cs_


using System;using bitzhuwei.Compiler;
namespace bitzhuwei.JsonFormat { partial class CompilerJson { public static readonly IReadOnlyList<Regulation> regulations = new Regulation[] { // [0] Json = Object ; new(0, st.Json枝, st.Object枝), // [1] Json = Array ; new(1, st.Json枝, st.Array枝), // [2] Object = '{' '}' ; new(2, st.Object枝, st.@LeftBrace符, st.@RightBrace符), // [3] Object = '{' Members '}' ; new(3, st.Object枝, st.@LeftBrace符, st.Members枝, st.@RightBrace符), // [4] Array = '[' ']' ; new(4, st.Array枝, st.@LeftBracket符, st.@RightBracket符), // [5] Array = '[' Elements ']' ; new(5, st.Array枝, st.@LeftBracket符, st.Elements枝, st.@RightBracket符), // [6] Members = Members ',' Member ; new(6, st.Members枝, st.Members枝, st.@Comma符, st.Member枝), // [7] Members = Member ; new(7, st.Members枝, st.Member枝), // [8] Elements = Elements ',' Element ; new(8, st.Elements枝, st.Elements枝, st.@Comma符, st.Element枝), // [9] Elements = Element ; new(9, st.Elements枝, st.Element枝), // [10] Member = 'string' ':' Value ; new(10, st.Member枝, st.@string, st.@Colon符, st.Value枝), // [11] Element = Value ; new(11, st.Element枝, st.Value枝), // [12] Value = 'null' ; new(12, st.Value枝, st.@null), // [13] Value = 'true' ; new(13, st.Value枝, st.@true), // [14] Value = 'false' ; new(14, st.Value枝, st.@false), // [15] Value = 'number' ; new(15, st.Value枝, st.@number), // [16] Value = 'string' ; new(16, st.Value枝, st.@string), // [17] Value = Object ; new(17, st.Value枝, st.Object枝), // [18] Value = Array ; new(18, st.Value枝, st.Array枝), }; }}


To reduce memory usage, this hard-coded implementation has also been replaced by a binary file (Json.Regulations.gen.bin). The folder is now kept for study and reference only, which is why I changed the C# files' extension from cs to cs_ so that they are not compiled.

The text form corresponding to Json.Regulations.gen.bin


19
12 = 1 (13)
12 = 1 (14)
13 = 2 (1 | 2)
13 = 3 (1 | 15 | 2)
14 = 2 (3 | 4)
14 = 3 (3 | 16 | 4)
15 = 3 (15 | 5 | 17)
15 = 1 (17)
16 = 3 (16 | 5 | 18)
16 = 1 (18)
17 = 3 (6 | 7 | 19)
18 = 1 (19)
19 = 1 (8)
19 = 1 (9)
19 = 1 (10)
19 = 1 (11)
19 = 1 (6)
19 = 1 (13)
19 = 1 (14)
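Comparing this dump with Json.Regulations.gen.cs_ suggests its layout: the first line is the rule count (19), and each following line reads `left = rightCount (id0 | id1 | ...)`, where the ids are the symbol indices (Json=12, Object=13, Array=14, 'string'=6, ...). For instance, `13 = 3 (1 | 15 | 2)` is rule [3] `Object = '{' Members '}'`. This reading is my inference from the two files, not documented behavior; decoding one such line could be sketched as:

```csharp
// Decode one line of the (inferred) regulations text format:
//   "<leftId> = <rightCount> (<id0> | <id1> | ...)"
using System;
using System.Linq;

string line = "13 = 3 (1 | 15 | 2)"; // [3] Object = '{' Members '}'
var parts = line.Split('=');
int left = int.Parse(parts[0].Trim());          // left-hand symbol id
var rest = parts[1].Trim();                     // "3 (1 | 15 | 2)"
int space = rest.IndexOf(' ');
int count = int.Parse(rest.Substring(0, space)); // number of right-hand symbols
int[] right = rest.Substring(space + 1).Trim('(', ')').Split('|')
                  .Select(s => int.Parse(s.Trim())).ToArray();

Console.WriteLine($"{left} -> [{string.Join(",", right)}], length {count}");
```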


To sum up, the generated artifacts are as shown below:



The generated extractor


Extraction means visiting each node of the syntax tree in post-order, and extracting the semantic information as each node is visited.


For example, the syntax tree of { "a": 0.3, "b": true, "a": "again" } looks like this:


R[0] Json = Object ;⛪T[0->12] └─R[3] Object = '{' Members '}' ;⛪T[0->12]    ├─T[0]='{' {    ├─R[6] Members = Members ',' Member ;⛪T[1->11]    │  ├─R[6] Members = Members ',' Member ;⛪T[1->7]    │  │  ├─R[7] Members = Member ;⛪T[1->3]    │  │  │  └─R[10] Member = 'string' ':' Value ;⛪T[1->3]    │  │  │     ├─T[1]='string' "a"    │  │  │     ├─T[2]=':' :    │  │  │     └─R[15] Value = 'number' ;⛪T[3]    │  │  │        └─T[3]='number' 0.3    │  │  ├─T[4]=',' ,    │  │  └─R[10] Member = 'string' ':' Value ;⛪T[5->7]    │  │     ├─T[5]='string' "b"    │  │     ├─T[6]=':' :    │  │     └─R[13] Value = 'true' ;⛪T[7]    │  │        └─T[7]='true' true    │  ├─T[8]=',' ,    │  └─R[10] Member = 'string' ':' Value ;⛪T[9->11]    │     ├─T[9]='string' "a"    │     ├─T[10]=':' :    │     └─R[16] Value = 'string' ;⛪T[11]    │        └─T[11]='string' "again"    └─T[12]='}' }


Following the post-order traversal, the extractor first visits T[0], T[1], T[2], T[3] and pushes them onto the stack; it then visits R[15] Value = 'number' ;⏪T[3], at which point it should:



// [15] Value = 'number' ;
var r0 = (Token)context.rightStack.Pop(); // pop T[3]
var left = new JsonValue(JsonValue.Kind.Number, r0.value);
context.rightStack.Push(left); // push the Value


Next it visits R[10] Member = 'string' ':' Value ;⏪T[1->3], at which point it should:


// [10] Member = 'string' ':' Value ;
var r0 = (JsonValue)context.rightStack.Pop(); // pop the Value
var r1 = (Token)context.rightStack.Pop(); // pop T[2]
var r2 = (Token)context.rightStack.Pop(); // pop T[1]
var left = new JsonMember(key: r2.value, value: r0);
context.rightStack.Push(left); // push the Member


Proceeding step by step, it eventually reaches the root node R[0] Json = Object ;⏪T[0->12], at which point it should:


var r0 = (List<JsonMember>)context.rightStack.Pop(); // pop the list of Members
var left = new Json(r0);
context.rightStack.Push(left); // push the Json


At this point the syntax tree has been fully visited, and the stack context.rightStack holds exactly one object: the final Json. It should then:


// [-1] Json' = Json ;
context.result = (Json)context.rightStack.Pop();
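The stack discipline above can be simulated on its own with a plain Stack&lt;object&gt;, using simple stand-ins (strings and tuples) for the generated Token/JsonValue/JsonMember types — this is a sketch of the mechanism, not the generated code:

```csharp
// Simulate the post-order, stack-based reduction for the fragment "a": 0.3
using System;
using System.Collections.Generic;

var stack = new Stack<object>();
// shift tokens T[1]..T[3]: "a"  ':'  0.3
stack.Push("a"); stack.Push(":"); stack.Push(0.3);

// reduce [15] Value = 'number' : pop 1 item, push a Value
var num = stack.Pop();
stack.Push(("Value", num));

// reduce [10] Member = 'string' ':' Value : pop 3 items, push a Member
var value = stack.Pop();
stack.Pop();                    // discard the ':' token
var key = (string)stack.Pop();
stack.Push(("Member", key, value));

Console.WriteLine(stack.Count); // 1: only the Member remains
```

Each reduction pops exactly as many items as the rule's right-hand side has symbols, then pushes one item for the left-hand side, which is why the stack ends with a single object at the root.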


The complete extractor code: InitializeExtractorItems


using System;using bitzhuwei.Compiler;
namespace bitzhuwei.JsonFormat { partial class CompilerJson { /// <summary> /// <see cref="LRNode.type"/> -&gt; <see cref="Action{LRNode, TContext{Json}}"/> /// </summary> private static readonly Action<LRNode, TContext<Json>>?[] @jsonExtractorItems = new Action<LRNode, TContext<Json>>[1/*'¥'*/ + 8/*Vn*/];
/// <summary> /// initialize dict for extractor. /// </summary> private static void InitializeExtractorItems() { var extractorItems = @jsonExtractorItems;
#region obsolete //extractorDict.Add(st.NotYet, //(node, context) => { // not needed. //}); //extractorDict.Add(st.Error, //(node, context) => { // nothing to do. //}); //extractorDict.Add(st.blockComment, //(node, context) => { // not needed. //}); //extractorDict.Add(st.inlineComment, //(node, context) => { // not needed. //}); #endregion obsolete
extractorItems[st.@终/*0*/] = static (node, context) => { // [-1] Json' = Json ; // dumped by user-defined extractor context.result = (Json)context.rightStack.Pop(); }; // end of extractorItems[st.@终/*0*/] = (node, context) => { ... }; const int lexiVtCount = 11; extractorItems[st.Json枝/*12*/ - lexiVtCount] = static (node, context) => { switch (node.regulation.index) { case 0: { // [0] Json = Object ; // dumped by user-defined extractor var r0 = (List<JsonMember>)context.rightStack.Pop(); var left = new Json(r0); context.rightStack.Push(left); } break; case 1: { // [1] Json = Array ; // dumped by user-defined extractor var r0 = (List<JsonValue>)context.rightStack.Pop(); var left = new Json(r0); context.rightStack.Push(left); } break; default: throw new NotImplementedException(); } }; // end of extractorItems[st.Json枝/*12*/ - lexiVtCount] = (node, context) => { ... }; extractorItems[st.Object枝/*13*/ - lexiVtCount] = static (node, context) => { switch (node.regulation.index) { case 2: { // [2] Object = '{' '}' ; // dumped by user-defined extractor var r0 = (Token)context.rightStack.Pop();// reserved word is omissible var r1 = (Token)context.rightStack.Pop();// reserved word is omissible var left = new List<JsonMember>(); context.rightStack.Push(left); } break; case 3: { // [3] Object = '{' Members '}' ; // dumped by user-defined extractor var r0 = (Token)context.rightStack.Pop();// reserved word is omissible var r1 = (List<JsonMember>)context.rightStack.Pop(); var r2 = (Token)context.rightStack.Pop();// reserved word is omissible var left = r1; context.rightStack.Push(left); } break; default: throw new NotImplementedException(); } }; // end of extractorItems[st.Object枝/*13*/ - lexiVtCount] = (node, context) => { ... 
}; extractorItems[st.Array枝/*14*/ - lexiVtCount] = static (node, context) => { switch (node.regulation.index) { case 4: { // [4] Array = '[' ']' ; // dumped by user-defined extractor var r0 = (Token)context.rightStack.Pop();// reserved word is omissible var r1 = (Token)context.rightStack.Pop();// reserved word is omissible var left = new List<JsonValue>(); context.rightStack.Push(left); } break; case 5: { // [5] Array = '[' Elements ']' ; // dumped by user-defined extractor var r0 = (Token)context.rightStack.Pop();// reserved word is omissible var r1 = (List<JsonValue>)context.rightStack.Pop(); var r2 = (Token)context.rightStack.Pop();// reserved word is omissible var left = r1; context.rightStack.Push(left); } break; default: throw new NotImplementedException(); } }; // end of extractorItems[st.Array枝/*14*/ - lexiVtCount] = (node, context) => { ... }; extractorItems[st.Members枝/*15*/ - lexiVtCount] = static (node, context) => { switch (node.regulation.index) { case 6: { // [6] Members = Members ',' Member ; // dumped by user-defined extractor var r0 = (JsonMember)context.rightStack.Pop(); var r1 = (Token)context.rightStack.Pop();// reserved word is omissible var r2 = (List<JsonMember>)context.rightStack.Pop(); var left = r2; left.Add(r0); context.rightStack.Push(left); } break; case 7: { // [7] Members = Member ; // dumped by user-defined extractor var r0 = (JsonMember)context.rightStack.Pop(); var left = new List<JsonMember>(); left.Add(r0); context.rightStack.Push(left); } break; default: throw new NotImplementedException(); } }; // end of extractorItems[st.Members枝/*15*/ - lexiVtCount] = (node, context) => { ... 
}; extractorItems[st.Elements枝/*16*/ - lexiVtCount] = static (node, context) => { switch (node.regulation.index) { case 8: { // [8] Elements = Elements ',' Element ; // dumped by user-defined extractor var r0 = (JsonValue)context.rightStack.Pop(); var r1 = (Token)context.rightStack.Pop();// reserved word is omissible var r2 = (List<JsonValue>)context.rightStack.Pop(); var left = r2; left.Add(r0); context.rightStack.Push(left); } break; case 9: { // [9] Elements = Element ; // dumped by user-defined extractor var r0 = (JsonValue)context.rightStack.Pop(); var left = new List<JsonValue>(); left.Add(r0); context.rightStack.Push(left); } break; default: throw new NotImplementedException(); } }; // end of extractorItems[st.Elements枝/*16*/ - lexiVtCount] = (node, context) => { ... }; extractorItems[st.Member枝/*17*/ - lexiVtCount] = static (node, context) => { switch (node.regulation.index) { case 10: { // [10] Member = 'string' ':' Value ; // dumped by user-defined extractor var r0 = (JsonValue)context.rightStack.Pop(); var r1 = (Token)context.rightStack.Pop();// reserved word is omissible var r2 = (Token)context.rightStack.Pop(); var left = new JsonMember(key: r2.value, value: r0); context.rightStack.Push(left); } break; default: throw new NotImplementedException(); } }; // end of extractorItems[st.Member枝/*17*/ - lexiVtCount] = (node, context) => { ... }; /* extractorItems[st.Element枝(18) - lexiVtCount] = static (node, context) => { switch (node.regulation.index) { case 11: { // [11] Element = Value ; // dumped by DefaultExtractor // var r0 = (VnValue)context.rightStack.Pop(); // var left = new VnElement(r0); // context.rightStack.Push(left); } break; default: throw new NotImplementedException(); } }; // end of extractorItems[st.Element枝(18) - lexiVtCount] = (node, context) => { ... 
}; */ extractorItems[st.Value枝/*19*/ - lexiVtCount] = static (node, context) => { switch (node.regulation.index) { case 12: { // [12] Value = 'null' ; // dumped by user-defined extractor var r0 = (Token)context.rightStack.Pop(); var left = new JsonValue(JsonValue.Kind.Null, r0.value); context.rightStack.Push(left); } break; case 13: { // [13] Value = 'true' ; // dumped by user-defined extractor var r0 = (Token)context.rightStack.Pop(); var left = new JsonValue(JsonValue.Kind.True, r0.value); context.rightStack.Push(left); } break; case 14: { // [14] Value = 'false' ; // dumped by user-defined extractor var r0 = (Token)context.rightStack.Pop(); var left = new JsonValue(JsonValue.Kind.False, r0.value); context.rightStack.Push(left); } break; case 15: { // [15] Value = 'number' ; // dumped by user-defined extractor var r0 = (Token)context.rightStack.Pop(); var left = new JsonValue(JsonValue.Kind.Number, r0.value); context.rightStack.Push(left); } break; case 16: { // [16] Value = 'string' ; // dumped by user-defined extractor var r0 = (Token)context.rightStack.Pop(); var left = new JsonValue(JsonValue.Kind.String, r0.value); context.rightStack.Push(left); } break; case 17: { // [17] Value = Object ; // dumped by user-defined extractor var r0 = (List<JsonMember>)context.rightStack.Pop(); var left = new JsonValue(r0); context.rightStack.Push(left); } break; case 18: { // [18] Value = Array ; // dumped by user-defined extractor var r0 = (List<JsonValue>)context.rightStack.Pop(); var left = new JsonValue(r0); context.rightStack.Push(left); } break; default: throw new NotImplementedException(); } }; // end of extractorItems[st.Value枝/*19*/ - lexiVtCount] = (node, context) => { ... };
} }}


Different application scenarios require different semantic information, so the one-click generated extractor code does not look like the above; instead, it merely flattens the syntax tree while preserving as much source-code information as possible, as shown below:


The one-click generated extractor code


using System;using bitzhuwei.Compiler;
namespace bitzhuwei.JsonFormat { partial class CompilerJson { /// <summary> /// <see cref="LRNode.type"/> -&gt; <see cref="Action{LRNode, TContext{Json}}"/> /// </summary> private static readonly Action<LRNode, TContext<Json>>?[] @jsonExtractorItems = new Action<LRNode, TContext<Json>>[1/*'¥'*/ + 8/*Vn*/];
/// <summary> /// initialize dict for extractor. /// </summary> private static void InitializeExtractorItems() { var extractorItems = @jsonExtractorItems;
#region obsolete //extractorDict.Add(st.NotYet, //(node, context) => { // not needed. //}); //extractorDict.Add(st.Error, //(node, context) => { // nothing to do. //}); //extractorDict.Add(st.blockComment, //(node, context) => { // not needed. //}); //extractorDict.Add(st.inlineComment, //(node, context) => { // not needed. //}); #endregion obsolete
extractorItems[st.@终/*0*/] = static (node, context) => { // [-1] Json' = Json ; // dumped by ExternalExtractor var @final = (VnJson)context.rightStack.Pop(); var left = new Json(@final); context.result = left; // final step, no need to push into stack. }; // end of extractorItems[st.@终/*0*/] = (node, context) => { ... }; const int lexiVtCount = 11; extractorItems[st.Json枝/*12*/ - lexiVtCount] = static (node, context) => { switch (node.regulation.index) { case 0: { // [0] Json = Object ; // dumped by InheritExtractor // class VnObject : VnJson var r0 = (VnObject)context.rightStack.Pop(); var left = r0; context.rightStack.Push(left); } break; case 1: { // [1] Json = Array ; // dumped by InheritExtractor // class VnArray : VnJson var r0 = (VnArray)context.rightStack.Pop(); var left = r0; context.rightStack.Push(left); } break; default: throw new NotImplementedException(); } }; // end of extractorItems[st.Json枝/*12*/ - lexiVtCount] = (node, context) => { ... }; extractorItems[st.Object枝/*13*/ - lexiVtCount] = static (node, context) => { switch (node.regulation.index) { case 2: { // [2] Object = '{' '}' ; // dumped by DefaultExtractor var r0 = (Token)context.rightStack.Pop();// reserved word is omissible var r1 = (Token)context.rightStack.Pop();// reserved word is omissible var left = new VnObject(r1, r0); context.rightStack.Push(left); } break; case 3: { // [3] Object = '{' Members '}' ; // dumped by DefaultExtractor var r0 = (Token)context.rightStack.Pop();// reserved word is omissible var r1 = (VnMembers)context.rightStack.Pop(); var r2 = (Token)context.rightStack.Pop();// reserved word is omissible var left = new VnObject(r2, r1, r0); context.rightStack.Push(left); } break; default: throw new NotImplementedException(); } }; // end of extractorItems[st.Object枝/*13*/ - lexiVtCount] = (node, context) => { ... 
}; extractorItems[st.Array枝/*14*/ - lexiVtCount] = static (node, context) => { switch (node.regulation.index) { case 4: { // [4] Array = '[' ']' ; // dumped by DefaultExtractor var r0 = (Token)context.rightStack.Pop();// reserved word is omissible var r1 = (Token)context.rightStack.Pop();// reserved word is omissible var left = new VnArray(r1, r0); context.rightStack.Push(left); } break; case 5: { // [5] Array = '[' Elements ']' ; // dumped by DefaultExtractor var r0 = (Token)context.rightStack.Pop();// reserved word is omissible var r1 = (VnElements)context.rightStack.Pop(); var r2 = (Token)context.rightStack.Pop();// reserved word is omissible var left = new VnArray(r2, r1, r0); context.rightStack.Push(left); } break; default: throw new NotImplementedException(); } }; // end of extractorItems[st.Array枝/*14*/ - lexiVtCount] = (node, context) => { ... }; extractorItems[st.Members枝/*15*/ - lexiVtCount] = static (node, context) => { switch (node.regulation.index) { case 6: { // [6] Members = Members ',' Member ; // dumped by ListExtractor 2 var r0 = (VnMember)context.rightStack.Pop(); var r1 = (Token)context.rightStack.Pop();// reserved word is omissible var r2 = (VnMembers)context.rightStack.Pop(); var left = r2; left.Add(r1, r0); context.rightStack.Push(left); } break; case 7: { // [7] Members = Member ; // dumped by ListExtractor 1 var r0 = (VnMember)context.rightStack.Pop(); var left = new VnMembers(r0); context.rightStack.Push(left); } break; default: throw new NotImplementedException(); } }; // end of extractorItems[st.Members枝/*15*/ - lexiVtCount] = (node, context) => { ... 
}; extractorItems[st.Elements枝/*16*/ - lexiVtCount] = static (node, context) => { switch (node.regulation.index) { case 8: { // [8] Elements = Elements ',' Element ; // dumped by ListExtractor 2 var r0 = (VnElement)context.rightStack.Pop(); var r1 = (Token)context.rightStack.Pop();// reserved word is omissible var r2 = (VnElements)context.rightStack.Pop(); var left = r2; left.Add(r1, r0); context.rightStack.Push(left); } break; case 9: { // [9] Elements = Element ; // dumped by ListExtractor 1 var r0 = (VnElement)context.rightStack.Pop(); var left = new VnElements(r0); context.rightStack.Push(left); } break; default: throw new NotImplementedException(); } }; // end of extractorItems[st.Elements枝/*16*/ - lexiVtCount] = (node, context) => { ... }; extractorItems[st.Member枝/*17*/ - lexiVtCount] = static (node, context) => { switch (node.regulation.index) { case 10: { // [10] Member = 'string' ':' Value ; // dumped by DefaultExtractor var r0 = (VnValue)context.rightStack.Pop(); var r1 = (Token)context.rightStack.Pop();// reserved word is omissible var r2 = (Token)context.rightStack.Pop(); var left = new VnMember(r2, r1, r0); context.rightStack.Push(left); } break; default: throw new NotImplementedException(); } }; // end of extractorItems[st.Member枝/*17*/ - lexiVtCount] = (node, context) => { ... }; extractorItems[st.Element枝/*18*/ - lexiVtCount] = static (node, context) => { switch (node.regulation.index) { case 11: { // [11] Element = Value ; // dumped by DefaultExtractor var r0 = (VnValue)context.rightStack.Pop(); var left = new VnElement(r0); context.rightStack.Push(left); } break; default: throw new NotImplementedException(); } }; // end of extractorItems[st.Element枝/*18*/ - lexiVtCount] = (node, context) => { ... 
}; extractorItems[st.Value枝/*19*/ - lexiVtCount] = static (node, context) => { switch (node.regulation.index) { case 12: { // [12] Value = 'null' ; // dumped by DefaultExtractor var r0 = (Token)context.rightStack.Pop();// reserved word is omissible var left = new VnValue(r0); context.rightStack.Push(left); } break; case 13: { // [13] Value = 'true' ; // dumped by DefaultExtractor var r0 = (Token)context.rightStack.Pop();// reserved word is omissible var left = new VnValue(r0); context.rightStack.Push(left); } break; case 14: { // [14] Value = 'false' ; // dumped by DefaultExtractor var r0 = (Token)context.rightStack.Pop();// reserved word is omissible var left = new VnValue(r0); context.rightStack.Push(left); } break; case 15: { // [15] Value = 'number' ; // dumped by DefaultExtractor var r0 = (Token)context.rightStack.Pop(); var left = new VnValue(r0); context.rightStack.Push(left); } break; case 16: { // [16] Value = 'string' ; // dumped by DefaultExtractor var r0 = (Token)context.rightStack.Pop(); var left = new VnValue(r0); context.rightStack.Push(left); } break; case 17: { // [17] Value = Object ; // dumped by DefaultExtractor var r0 = (VnObject)context.rightStack.Pop(); var left = new VnValue(r0); context.rightStack.Push(left); } break; case 18: { // [18] Value = Array ; // dumped by DefaultExtractor var r0 = (VnArray)context.rightStack.Pop(); var left = new VnValue(r0); context.rightStack.Push(left); } break; default: throw new NotImplementedException(); } }; // end of extractorItems[st.Value枝/*19*/ - lexiVtCount] = (node, context) => { ... };
} }}


This is conservative, smallest-step code: programmers can build on it, or write their own extraction actions for each node type from scratch. Since the goal in this scenario is to parse Json text files as efficiently as possible, the extraction actions for each node type were written entirely by hand.


Tests


Test case 0


{}


Test case 1


[]


Test case 2

{ "a": 0.3 }


Test case 3


{
  "a": 0.3,
  "b": true
}


Test case 4


{
  "a": 0.3,
  "b": true,
  "a": "again"
}


Test case 5


{
  "a": 0.3,
  "b": true,
  "a": "again",
  "array": [
    1,
    true,
    null,
    "str",
    {
      "t": 100,
      "array2": [ false, 3.14, "tmp" ]
    }
  ]
}


All of the test cases above are parsed correctly by the Json parser; they can also be verified at (https://jsonlint.com/).


The code to invoke the Json parser is as follows:


var compiler = new bitzhuwei.JsonFormat.CompilerJson();
var sourceCode = File.ReadAllText("xxx.json");
var tokens = compiler.Analyze(sourceCode);
var syntaxTree = compiler.Parse(tokens);
var json = compiler.Extract(syntaxTree.root, tokens, sourceCode);
// use json ...


Article reposted from: BIT祝威

Original article: https://www.cnblogs.com/bitzhuwei/p/18779851

