Stim compiler #3305
amcasey left a comment
Some questions about the lexer. I'm just learning, so don't take any of this as blocking.
};

#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct Token {
Is there a concept of a known-but-erroneous token? For example, in many languages 0.2 is valid, but .2 is not, but you'd want both to appear as Double tokens for error recovery purposes. (This is almost certainly out of scope for this proof-of-concept implementation.)
self.eat_while(|c| c.is_ascii_digit());
let mut is_double = false;
if self.chars.next_if(|(_, c)| *c == '.').is_some() {
    self.eat_while(|c| c.is_ascii_digit());
Is this guaranteed to consume at least one digit? Or does the language allow 2. as a valid double?
Looks like we allow scientific notation. Surely, 2.e isn't allowed?
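A minimal sketch of what a stricter check could look like, assuming (this is not confirmed against the Stim grammar) that both the fraction and the exponent need at least one digit and that the exponent may carry a sign. `NumberKind`, `eat_digits`, and `scan_number_strict` are hypothetical names, and the sketch runs on a standalone `Peekable<CharIndices>` rather than the PR's lexer state. Malformed input is still consumed as one token, so the parser can report it and keep going.

```rust
use std::iter::Peekable;
use std::str::CharIndices;

/// Hypothetical classification; not the PR's `TokenKind`.
#[derive(Debug, PartialEq, Eq)]
pub enum NumberKind {
    Int,
    Double,
    /// Looked like a number but was malformed, e.g. `2.`, `2.e`, or `.2`.
    Malformed,
}

/// Consumes consecutive ASCII digits and returns how many were eaten.
fn eat_digits(chars: &mut Peekable<CharIndices<'_>>) -> usize {
    let mut count = 0;
    while chars.next_if(|(_, c)| c.is_ascii_digit()).is_some() {
        count += 1;
    }
    count
}

/// Scans a number at the start of `input`.
/// Returns the classification and the number of bytes consumed.
pub fn scan_number_strict(input: &str) -> (NumberKind, usize) {
    let mut chars = input.char_indices().peekable();
    // The integer part must contain at least one digit.
    let mut ok = eat_digits(&mut chars) > 0;
    let mut is_double = false;

    if chars.next_if(|(_, c)| *c == '.').is_some() {
        is_double = true;
        // Assumption: a fraction needs at least one digit, so `2.` is rejected.
        ok &= eat_digits(&mut chars) > 0;
    }
    if chars.next_if(|(_, c)| *c == 'e' || *c == 'E').is_some() {
        is_double = true;
        // Assumption: the exponent may carry an optional sign.
        let _ = chars.next_if(|(_, c)| *c == '+' || *c == '-');
        // Assumption: an exponent needs at least one digit, so `2.e` is rejected.
        ok &= eat_digits(&mut chars) > 0;
    }

    // Byte offset of the first character that was not consumed.
    let end = chars.peek().map_or(input.len(), |(i, _)| *i);
    let kind = match (ok, is_double) {
        (false, _) => NumberKind::Malformed,
        (true, true) => NumberKind::Double,
        (true, false) => NumberKind::Int,
    };
    (kind, end)
}
```

With this shape, `scan_number_strict("2.e")` consumes all three bytes but reports `NumberKind::Malformed`, while `scan_number_strict("0.2")` reports `NumberKind::Double`.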
    self.whitespace();
}

fn scan_number(&mut self) -> TokenKind {
Can there be a sign for the whole number? +2?
nope! numbers are very basic in stim, nothing fancy (not even computations)
So, to confirm, there are no negative angles? You have to normalize to a positive angle?
}

fn scan_identifier(&mut self, lo: usize) -> TokenKind {
    self.eat_while(|c| c.is_alphanumeric() || c == '_');
A lot of languages don't allow identifiers to start with digits. Not sure if that's true of stim.
It is! per their grammar this is what a name can be: [a-zA-Z][a-zA-Z0-9_]*
granted the "identifier" concept isn't exactly from stim, but I used it to simplify the code. Will have to revisit it later for correctness, though
The regex you provided doesn't seem to allow the identifier to start with a digit.
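For reference, a minimal sketch of a check that follows the quoted regex `[a-zA-Z][a-zA-Z0-9_]*` literally. `is_stim_name` is a hypothetical helper, not something in the PR. Note that `char::is_alphanumeric`, as used in `scan_identifier`, also accepts non-ASCII letters and digits, which the regex does not.

```rust
/// Hypothetical helper: returns true when `name` matches
/// `[a-zA-Z][a-zA-Z0-9_]*`, the name rule quoted from the Stim grammar.
pub fn is_stim_name(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        // The first character must be an ASCII letter (no digit, no underscore).
        Some(first) if first.is_ascii_alphabetic() => {
            // The remaining characters may be ASCII letters, digits, or '_'.
            chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
        }
        // Empty input or an invalid first character.
        _ => false,
    }
}
```

Under this rule `X_ERROR` and `rec` are accepted, while `2q`, `_x`, and the empty string are rejected.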
    .map_or(self.input_len as usize, |(i, _)| *i);
// TODO: What if some identifier starts with "rec" but is not a rec token?
match &self.input[lo..hi] {
    "rec" => {
I'm probably just blanking, but where did we check for the open [?
The three cases where we could have [] are:
1- rec[...]
2- sweep[...]
3- tags! For example in the statement: X_ERRORa 3 4
But parsing the brackets individually added a ton of complexity to distinguish between these three cases, so I chose to just consume them as a whole with those tokens, and then strip them away for the content. Will also revisit this later!
I'm not sure I understand the complexity. But it looks like maybe the single token includes the contents of the square brackets and isn't a keyword followed by punctuation, etc?
    while self.chars.next_if(|i| f(i.1)).is_some() {}
}

fn whitespace(&mut self) {
Is this going to consume newlines without creating corresponding tokens?
amcasey left a comment
Parser comments. Still non-blocking. Happy to chat if my questions don't make sense (which is reasonably likely).
#[derive(Debug)]
pub struct Line {
    pub span: Span,
Is this different from the span that's in the Instruction?
        None => break,
    }
}
let closing_brace = self.expect(TokenKind::Close(Brace));
In the future, we might want to synthesize a missing closing brace for recovery purposes.
}

fn parse_line(&mut self, instruction: Instruction) -> Line {
    self.expect(TokenKind::Newline);
Personally, I find it a little strange to start a line with a newline, rather than end it with a newline. Does this cause any problems at file boundaries?
}

fn extract_uint(&mut self, token: Token, span: Option<Span>) -> u32 {
    self.extract_string(token, span).parse().unwrap()
I think this panics if the number is too large to fit in a u32, for example?
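A minimal sketch of a non-panicking alternative, in case it helps. `parse_uint` and `UintError` are hypothetical names. The sketch takes the token text directly instead of going through the PR's `extract_string`, and it uses `IntErrorKind::PosOverflow` from the standard library to tell "too large for a `u32`" apart from other parse failures.

```rust
use std::num::IntErrorKind;

/// Hypothetical error type; the PR's real diagnostics may look different.
#[derive(Debug, PartialEq, Eq)]
pub enum UintError {
    /// The digits were valid but the value does not fit in a `u32`.
    TooLarge,
    /// The text was empty or contained a non-digit character.
    Invalid,
}

/// Hypothetical counterpart to `extract_uint` that reports failures
/// as a `Result` instead of calling `unwrap()`.
pub fn parse_uint(text: &str) -> Result<u32, UintError> {
    text.parse::<u32>().map_err(|err| match err.kind() {
        IntErrorKind::PosOverflow => UintError::TooLarge,
        _ => UintError::Invalid,
    })
}
```

Here `parse_uint("4294967296")` returns `Err(UintError::TooLarge)` rather than panicking, and the caller can turn that into a diagnostic attached to the token's span.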
from typing import List, Literal, Optional, Tuple


def compile(src: str, noise: Optional[NoiseConfig]) -> Tuple[str, NoiseConfig]:
I'm not a fan of this returning a tuple. I keep forgetting to destructure the results and wondering why I have a list. Also qsharp.compile and openqasm.compile return a QirInputData. We should be consistent.
(It's OK if we return the QIR string for now rather than QirInputData, but let's still just return the QIR and not a tuple)
e1767ae to
44f52b7
…rings' into joaoboechat/stim-compiler
    shot.unitary[1] = op.unitary[1];
    shot.unitary[4] = cplxNeg(op.unitary[4]);
    shot.unitary[5] = cplxNeg(op.unitary[5]);
} else if (rand < (p_x + p_z + p_y)) {
Why did this get moved? Was there an issue to fix or optimization to be had?
This is an initial PR for the Stim compiler, which includes the setup for supporting the language. We still need to add tests, better error handling, and more language features, among other things, but the compiler introduced by this PR is intended to be fully functional, with as few faults as possible.
All of the code was designed around the stim language, which is mostly defined by these two documents:
Stim/doc/file_format_stim_circuit.md at main · quantumlib/Stim
Stim/doc/gates.md at main · quantumlib/Stim