Cleanup lexical structure of numbers and identifiers #403
I am not sure why we should disallow decimal numbers as a notation for decimal numbers.
|
What I propose for numeric literals is what GHC already does now. I hope that is not controversial; otherwise you should file a bug report on the GHC issue tracker. EDIT: I mention in alternatives that other scripts can be recognized for decimal numbers, but I argue that only with |
Yes, I am aware of both your points, and I understand that it may not be worth the effort for you or me to actually write a parser for unusual numbers. But there is a difference between being too lazy to do something and disallowing that thing altogether. I do not argue that we should immediately start parsing unusual numbers; I argue that we should leave the possibility open and inviting. An error message like «unusual numbers are not yet implemented» would be suitable. |
This is not the case with other Unicode syntax either today
Improving error messages doesn't need a proposal, IMHO. As my proposal accidentally shows, lexer errors are awful. I'm not against improving errors like:
They could be clearer, and might invite people to vote for a feature. EDIT: the goal of this proposal is to document the status quo and fix one clear wart (not treating letter numbers as any kind of "letter"). |
I am not sure I follow. What is not the case with other Unicode syntax? Note also that I am not giving any comment as to whether |
There is. I argue that having GHC accept
evil :: Int
evil = ٦٦٦
without any extension enabled would confuse most Haskell programmers (and tools). |
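To make the distinction in this exchange concrete, here is a minimal sketch using only `Data.Char` from `base`. It shows that the Arabic-Indic digit six (U+0666) is a decimal digit as far as Unicode is concerned, while `isDigit` accepts ASCII digits only. The names `arabicIndicSix` and `isUnicodeDecimal` are made up for illustration and are not part of GHC or the proposal.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, isDigit)

-- ARABIC-INDIC DIGIT SIX, written as an escape so the source stays ASCII.
-- (Hypothetical name, for illustration only.)
arabicIndicSix :: Char
arabicIndicSix = '\x0666'

-- True for any character in Unicode general category Nd (DecimalNumber),
-- whatever its script. (Hypothetical helper.)
isUnicodeDecimal :: Char -> Bool
isUnicodeDecimal c = generalCategory c == DecimalNumber

-- Data.Char.isDigit selects only the ASCII digits '0'..'9',
-- so the two predicates disagree on non-ASCII decimal digits.
disagreesWithAscii :: Char -> Bool
disagreesWithAscii c = isUnicodeDecimal c && not (isDigit c)
```

Under these definitions `disagreesWithAscii arabicIndicSix` holds while `disagreesWithAscii '6'` does not, which is the gap between "decimal number in Unicode" and "digit in a Haskell literal" that the comments above are arguing about.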
This is not an argument specific to unusual numbers — it is a general argument about Unicode and |
What's the status of underscore ( But also there's extension
|
@AntC2, yes, underscore is a small character
Re, In particular, that proposal changes
-decimal → digit{digit}
+decimal → digit{numSpacer digit}
However the implementation was and is in terms of
-decimal → ascDigit{ascDigit}
+decimal → ascDigit{numSpacer ascDigit}
This proposal is compatible. I will mention Does this answer your concerns? |
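The production `decimal → ascDigit{numSpacer ascDigit}` can be read as a small recognizer. The sketch below is not GHC's lexer; it assumes that `ascDigit` means the ASCII digits `0`-`9` and that `numSpacer` stands for a run of zero or more underscores. The function names are hypothetical.

```haskell
-- ascDigit: assumed to be exactly the ASCII digits '0'..'9'.
isAscDigit :: Char -> Bool
isAscDigit c = c >= '0' && c <= '9'

-- Recognizer for: decimal -> ascDigit{numSpacer ascDigit}
-- with numSpacer assumed to be zero or more underscores.
-- Underscores may therefore appear only between two digits.
isDecimal :: String -> Bool
isDecimal (c : cs) | isAscDigit c = go cs
  where
    -- Each iteration consumes one "numSpacer ascDigit" group.
    go [] = True
    go s = case dropWhile (== '_') s of
      (d : rest) | isAscDigit d -> go rest
      _ -> False
isDecimal _ = False
```

With this reading, `isDecimal "1_000"` and `isDecimal "42"` hold, while a leading or trailing underscore (`"_1"`, `"1_"`) and non-ASCII digits are rejected.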
Thanks. (No "concerns", just dotting i's, crossing t's, and scoring under's.) |
It's worth noting that GHC already treats Other Letter as lowercase, ever since https://gitlab.haskell.org/ghc/ghc/-/issues/3741, and this without any language pragma. What's missing, both from GHC and from the proposal, is a way to name a type or constructor using an uncased script. I had written a very preliminary proposal for Haskell 2020, but it never went anywhere. |
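As a rough illustration of the behaviour described here, the following sketch classifies the first character of an identifier by Unicode general category, with Other Letter falling on the lowercase side. It is a simplified model built on `Data.Char`, not GHC's actual lexer; the type `IdentStart` and the function `classifyStart` are hypothetical names.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- What kind of identifier a character can start (simplified model).
data IdentStart = VarStart | ConStart | NotIdentStart
  deriving (Eq, Show)

-- Uppercase and titlecase letters start constructor-like names;
-- lowercase letters, underscore, and (as described in the comment above)
-- Other Letter characters start variable-like names.
classifyStart :: Char -> IdentStart
classifyStart '_' = VarStart
classifyStart c = case generalCategory c of
  UppercaseLetter -> ConStart
  TitlecaseLetter -> ConStart
  LowercaseLetter -> VarStart
  OtherLetter     -> VarStart
  _               -> NotIdentStart
```

In this model a Thai letter such as U+0E01 (general category Other Letter) yields `VarStart`, and no character of an uncased script ever yields `ConStart`, which is the missing piece the comment points out.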
The proposal text says
And notes that
meaning that everything else is already in GHC, but undocumented. I'd welcome suggestions on how to make the proposal text clearer. |
@blamario, I read through your proposal. It's much more ambitious, and it also contains a breaking change. Therefore I won't incorporate any parts of it here. I'm afraid to open Pandora's box. |
That's all right, I didn't expect you to adopt it. I think that Haskell really needs to support the native scripts of approximately half of humanity, but I won't pretend my proposal solves the problem. Nowadays I think that a unified namespace as in Idris would be ideal, but I don't know how to get Haskell there.
Perhaps say something like the proposed You should also point to ticket #3741. |
I'm sorry, how is #3741 related?
Did you mean to mention some other issue? |
I got to that issue by following the history of the test file |
Probably https://gitlab.haskell.org/ghc/ghc/-/issues/1103, I'll add it to the list of issues. |
@nomeata (are you still acting as secretary?) I'd like to submit this proposal to the committee. |
I think you meant @nomeata |
As the shepherd, I'll review right after the ICFP deadline (in a little over a week). |
/remind @aspiwack that the deadline is over in two weeks :-) |
@nomeata set a reminder for Mar 6th 2021 |
Hello, I am back as promised. And here is a bit of review before sending the proposal to the committee.
The one discussion that I find a bit lacking, and which should probably find its way into the Alternatives section, is why the Other Letter group is treated as small characters.
It would be natural to consider them as just idChar, since these don't care about case. Ah, I believe you say, but then there is no way to write an identifier in Thai script, and that is kind of mean.
Which is fair, but should probably figure in the Alternatives section nonetheless. But maybe more interestingly: with your proposal, I still can't write a constructor name in Thai script, and that's kind of mean too. So why do you choose to favour varid over conid for scripts without case?
Co-authored-by: Arnaud Spiwack <arnaud@spiwack.net>
Yes. But that would be a change that breaks how GHC works now, so I don't propose it. I added a note to the Alternatives section. |
When implemented, the difference to Haskell98 should probably be documented in this part of the user’s guide: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/bugs.html |
I forgot to get back here, but I did recommend acceptance to the committee. |
Yes. The lack of documentation is one of the reasons for writing this proposal. (It also took quite long to get to writing it, as the current state wasn't documented.) |
Co-authored-by: Joachim Breitner <mail@joachim-breitner.de>
👋 @aspiwack, the deadline is over :-) |
The proposal has been accepted; the following discussion is mostly of historic interest.
This proposal cleans up and clarifies the lexical structure of numbers and identifiers. (Contains Unicode inside.)
Rendered