RecordDotSyntax language extension proposal #282

Merged
merged 53 commits into from May 3, 2020

@ghost ghost commented Oct 11, 2019

The proposal has been accepted; the following discussion is mostly of historic interest.


We propose a new language extension RecordDotSyntax that provides syntactic sugar to make the features introduced in the HasField proposal more accessible, improving the user experience.

Rendered

@ghost ghost closed this Oct 11, 2019
@simonpj

simonpj commented Oct 11, 2019

The link is broken in the rendered proposal, where it says "This proposal is discussed at this pull request"

@simonpj

simonpj commented Oct 11, 2019

I strongly support the direction of travel of this proposal. I've wanted to use dot-notation for record selection since forever; and this looks like a very plausible way to do so. The fact that it's been used extensively in a production context helps reassure me that there aren't unexpected consequences.

I'd like more clarity about white space.

  • f.x means getField @"x" f
  • f .x (note the space after f) means.... what? Perhaps f (\r -> r.x)?
  • f (.x) presumably really does mean f (\r -> r.x)
  • f(.x) presumably means the same thing.

Perhaps the right way to think about it is that .x is a postfix operator. You cannot put any white space after the dot, but you can always put as much whitespace as you like before the dot.

GHC already supports postfix operators: here is the manual section. It would be good to check that the proposal is compatible with treating .x as a postfix operator in the sense of that section.

@ghost ghost reopened this Oct 11, 2019
@ghost

ghost commented Oct 11, 2019

The link is broken in the rendered proposal, where it says "This proposal is discussed at this pull request"

So fast Simon! I was just fixing it 😉

@ghost

ghost commented Oct 11, 2019

You cannot put any white space after the dot, but you can always put as much whitespace as you like before the dot.

Yes, that's how it behaves.

@ghost

ghost commented Oct 11, 2019

I'd like more clarity about white space.

  • f.x means getField @"x" f

Yes.

  • f .x (note the space after f) means.... what? Perhaps f (\r -> r.x)?

f .x is the same as f.x (that is, getField @"x" f).

  • f (.x) presumably really does mean f (\r -> r.x)

Exactly.

  • f(.x) presumably means the same thing.

Yes.
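
The two forms whose meaning is not in question here, r.x with no whitespace and the parenthesised section (.x), can be tried out directly. This is a minimal sketch, assuming GHC 9.2 or later, where the extension shipped under the name OverloadedRecordDot (the name used later in this thread). The Person type and the function names are made up for illustration.

```haskell
{-# LANGUAGE OverloadedRecordDot #-}

-- Hypothetical record type, used only for illustration.
data Person = Person { name :: String, age :: Int }

-- r.x with no whitespace around the dot: field selection.
greeting :: Person -> String
greeting p = "Hello, " ++ p.name

-- (.x) in parentheses: a selector section, i.e. \r -> r.x.
allNames :: [Person] -> [String]
allNames = map (.name)

-- A dot with whitespace on both sides is still function composition,
-- so sections and composition can be combined.
totalAge :: [Person] -> Int
totalAge = sum . map (.age)
```

The spaced form f .x is deliberately left out of the sketch, since its treatment was still under discussion at this point.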

@simonpj

simonpj commented Oct 11, 2019

Thanks. Perhaps in due course update the proposal to make these points clear.


Below are some possible variations on this plan, but we advocate the choices made above:

* Should `RecordDotSyntax` imply `NoFieldSelectors`? They are often likely to be used in conjunction, but they aren't inseparable.

I'd argue that it should not. Nothing about this proposal requires that selectors not exist, it is merely helpful to avoid clashes (for which DuplicateRecordFields would work pretty much equivalently).

If a consensus emerges that RecordDotSyntax + NoFieldSelectors + ... is the right way forward, we could easily add a new extension Haskell2030Records that implies the conjunction.

@adamgundry

In addition to the lack of type-changing updates, the HasField approach is limited compared to existing selector functions in that it cannot support higher-rank fields. (This is arguably more related to the HasField proposal, but perhaps worth flagging up in this context as well.) For example, given

data T = MkT { foo :: forall a . a -> a }

then foo (MkT id) is well-typed but getField @"foo" (MkT id) is not.

Perhaps this use is rare enough that users can re-enable selector function generation (or define their own selectors) in this case.
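
The selector half of this observation can be illustrated with a small sketch. Only RankNTypes is assumed; useTwice and useTwice' are hypothetical names. The getField half is not shown, because that expression is exactly the one reported above as ill-typed.

```haskell
{-# LANGUAGE RankNTypes #-}

data T = MkT { foo :: forall a. a -> a }

-- The generated selector keeps the field polymorphic, so one T can be
-- used at two different types in the same expression.
useTwice :: T -> (Int, Bool)
useTwice t = (foo t 1, foo t True)

-- Pattern matching gives the same access without relying on the selector.
useTwice' :: T -> (Int, Bool)
useTwice' (MkT f) = (f 1, f True)
```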

@phadej

phadej commented Oct 11, 2019

{-# LANGUAGE RecordDotSyntax #-}

import qualified Foo

data Foo = Foo

instance HasField "name" Foo () where hasField = ...

something = ... Foo.name

is that Foo.name

  • a name from Foo module
  • or getField @"name" Foo

EDIT:

Foo.name  -- name from module Foo
Foo .name -- getField ...

Proposal should point this out.

@nomeata

nomeata commented Oct 11, 2019

Perhaps this use is rare enough that users can re-enable selector function generation (or define their own selectors) in this case.

It may be rare, but it turned out to be a blocker for various ways of fixing #216.

@parsonsmatt

The loss of polymorphic update is a huge problem, imo.

@ghost

ghost commented Oct 11, 2019

Foo.name

This parses as a qualified variable, not field selection.
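
A self-contained way to see the difference, as a sketch only: it assumes GHC 9.2 or later with OverloadedRecordDot, and it substitutes Data.Char (imported under the alias C) for the Foo module of the example above. The record type C and the names fromModule and fromField are hypothetical.

```haskell
{-# LANGUAGE OverloadedRecordDot #-}

import qualified Data.Char as C

-- Hypothetical record whose constructor shares its name with the module
-- alias and whose field shares its name with a function from that module.
data C = C { isDigit :: Bool }

-- C.isDigit is a qualified name: the function exported by Data.Char.
fromModule :: Bool
fromModule = C.isDigit '7'

-- Parenthesising the record value makes the dot a field selection.
fromField :: Bool
fromField = (C False).isDigit
```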

@int-index

int-index commented Oct 11, 2019

The e{lbl * val} syntax is a bit perplexing to me. Firstly, it has no = sign, making it hard to recognize at first that there's an update going on here. Secondly, it makes commutative operators non-commutative. That is, the following two lines are not equivalent:

  1. c{taken.year + n}
  2. c{n + taken.year}

I would propose that we introduce a e{lbl * = val} syntax instead. The examples from the proposal would look like:

addYears :: Class -> Int -> Class
addYears c n = c{taken.year + = n} -- update via op

squareUnits :: Class -> Class
squareUnits c = c{units & = \x -> x * x} -- update via function

It would also nicely parallel the C++ syntax +=, -=, *=, etc, differing only in whitespace.

@phadej

phadej commented Oct 11, 2019

The getField part looks not like PostfixOperators but like TypeApplications

foo @Int .field1 @Bar .field2

I can imagine having an Overloaded:RecordDots option in overloaded plugin, as val .name could be desugared into whatever @"name" val, not only getLabel

So for me, the getField part of this proposal is simply introducing some new expression syntax. More things to play with.

@phadej

phadej commented Oct 11, 2019

In #282 (comment) the

f .x (note the space after f) means.... what? Perhaps f (\r -> r.x)?

f .x is the same as f.x (that is, getField @"x" f).

The proposal should somehow specify how juxtaposition works now; it looks like some forms will bind more tightly than others:

someFun record .field .field2 value

should probably be parsed the same as

someFun record.field.field2 value

i.e.

(someFun ((record.field).field2)) value

Should there be warnings for an omitted leading space? It feels like a "don't use tabs" thing.

@ghost

ghost commented Oct 11, 2019

Foo.name  -- name from module Foo
Foo .name -- getField ...

Proposal should point this out.

I shall take care of it.

@ghost

ghost commented Oct 11, 2019

The proposal should somehow specify how juxtaposition works now; it looks like some forms will bind more tightly than others:

someFun record .field .field2 value

should probably be parsed the same as

someFun record.field.field2 value

i.e.

(someFun ((record.field).field2)) value

As it's implemented in the prototype, function application is taking precedence over field projection so f a.foo.bar.baz.quux 12 parses as ((f a).foo.bar.baz.quux) 12. To treat the first argument to f as a projection of a, write f (a.foo.bar.baz.quux) 12 and f (a .foo .bar .baz .quux) 12 is equivalent. Update : #282 (comment)
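
Writing the parentheses explicitly sidesteps the precedence question altogether. The following is a minimal sketch of that parenthesised form, assuming GHC 9.2 or later with OverloadedRecordDot. The Inner and Outer records and the scale function are hypothetical, and only the tight form (no whitespace before the dots) is used.

```haskell
{-# LANGUAGE OverloadedRecordDot #-}

-- Hypothetical nested records.
data Inner = Inner { bar :: Int }
data Outer = Outer { foo :: Inner }

-- Hypothetical two-argument function standing in for f.
scale :: Int -> Int -> Int
scale x k = x * k

-- The parentheses make the first argument unambiguously the projection
-- a.foo.bar, whichever way application and projection are prioritised.
scaled :: Outer -> Int
scaled a = scale (a.foo.bar) 12
```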

@phadej

phadej commented Oct 11, 2019

As it's implemented in the prototype, function application is taking precedence over field projection

That definitely should be pointed out in the proposal. It's the opposite of what I thought it was.

@ghost

ghost commented Oct 11, 2019

That definitely should be pointed out in the proposal.

Maybe best to go to the unresolved questions section. Hard to call the "right" parse here.

@gbaz

gbaz commented Oct 11, 2019

In my opinion, this proposal leaves at least one huge thing to be desired. It provides an "alternate route" to compositional projection (besides generating projection functions directly) by having the . denote a projection function. However it does not provide any route to compositional update. In our experience (at awake), this feature is key. In particular, having sugar for both makes the "stealing" of the syntax from idiomatic lens usage much less painful.

Here's one way to do it. Just as (.lbl) expands to (\x -> x.lbl), have:

(.=lbl) ==> (\x v -> x{lbl = v})

[of course, pick your exact syntax poison of choice -- &= is another good candidate].

One nice thing here is you can even parse the "assignment" as an infix operator if desired, so x .=lbl v reads as x with lbl assigned to value v.

This could also desugar to using SetField if you prefer. However, I confess by the way I don't understand why SetField is used at all in this proposal?

I.e., it has e{lbl = val} ==> setField @"lbl" e val. But why not just leave it as is? The subsequent (nested) desugaring e{lbl1.lbl2 = val} ==> e{lbl1 = (e.lbl1){lbl2 = val}} would still work correctly, no? And furthermore, if SetField isn't used, then doesn't polymorphic record update still work?
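
Both ideas in this comment can be written out by hand with ordinary record update. This is a sketch only, assuming GHC 9.2 or later with OverloadedRecordDot for the selection part. The Quarter and Class records are hypothetical (their field names echo the proposal's examples), and setUnits merely spells out what a setter section such as the suggested (.=lbl) would stand for; no such syntax is assumed to exist.

```haskell
{-# LANGUAGE OverloadedRecordDot #-}

-- Hypothetical records, used only for illustration.
data Quarter = Quarter { year :: Int, term :: String } deriving (Eq, Show)
data Class = Class { units :: Int, taken :: Quarter } deriving (Eq, Show)

-- Hand-written form of the nested expansion
--   e{lbl1.lbl2 = val}  ==>  e{lbl1 = (e.lbl1){lbl2 = val}}
-- using ordinary record update for each layer.
setTakenYear :: Class -> Int -> Class
setTakenYear c y = c { taken = (c.taken) { year = y } }

-- The lambda that a setter section for the field units would abbreviate.
setUnits :: Class -> Int -> Class
setUnits = \x v -> x { units = v }
```

Because each layer is a plain record update, nothing here goes through SetField.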

@simonpj

simonpj commented Oct 11, 2019

I shall take care of it.

It'd be great to update the proposal to cover all the syntactic questions here, so that it stands by itself without reading the discussion thread. For example

As it's implemented in the prototype, function application is taking precedence over field projection so f a.foo.bar.baz.quux 12 parses as ((f a).foo.bar.baz.quux) 12. To treat the first argument to f as a projection of a, write f (a.foo.bar.baz.quux) 12 and f (a .foo .bar .baz .quux) 12 is equivalent.

Make sure the proposal says all this!

@ghost

ghost commented Oct 11, 2019

I shall take care of it.

It'd be great to update the proposal to cover all the syntactic questions here, so that it stands by itself without reading the discussion thread. For example

As it's implemented in the prototype, function application is taking precedence over field projection so f a.foo.bar.baz.quux 12 parses as ((f a).foo.bar.baz.quux) 12. To treat the first argument to f as a projection of a, write f (a.foo.bar.baz.quux) 12 and f (a .foo .bar .baz .quux) 12 is equivalent.

Make sure the proposal says all this!

Understood Simon. On it... Done.

@cocreature

This could also desugar to using SetField if you prefer. However, I confess by the way I don't understand why SetField is used at all in this proposal?

This proposal tries to solve two things at once (which could probably be pointed out more clearly in the proposal):

  1. The use of dot syntax and, together with that, an easy way to deal with nested fields. This is independent of SetField.
  2. A better solution to colliding field names than DuplicateRecordFields. This depends on SetField.

@phadej

phadej commented Oct 11, 2019

@gbaz, one value of desugaring to something that uses a class is that one can write manual instances of the class, i.e. you can do "Classy Lenses" stuff. Compare with writing "foo" with and without {-# language OverloadedStrings #-}. Yet {-# language OverloadedRecordDots #-} would be too much, i.e.

  • plain {-# language RecordDotSyntax #-} would desugar to selector functions and record updates and
  • {-# language RecordDotSyntax, OverloadedRecordDotSyntax #-} would desugar to type-class members

It's silly to be so granular here, but OTOH there are various things happening: new syntax, and overloading that syntax with existing HasField functionality.


@cocreature was first, and I agree

This proposal tries to solve two things at once (which could probably be pointed out more clearly in the proposal)

@goldfirere

When RecordDotSyntax is in effect, the use of '.' to denote record field access is disambiguated from function composition by the absence of whitespace trailing the '.'.

In the language of https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0229-whitespace-bang-patterns.rst, . will be a prefix operator.


In the bit about update syntax of pbind, why does qvar appear in one production but just var in the others?


@parsonsmatt

The loss of polymorphic update is a huge problem, imo.

Why? Have you used polymorphic update? I agree that it's nice and compositional to have polymorphic update, but is it actually useful in practice?


For update sections (along the lines of #282 (comment)), I humbly submit (.lbl =) as the syntax. (Whitespace before the = not significant, but I like it.) The proposals from @gbaz above use .= and &=, which are both valid operators.

@fsoikin

fsoikin commented Oct 11, 2019

I'm a bit worried about polymorphic fields. I agree it's not a frequent use case, but at least I'd very much like to have an escape hatch.

To that end, I'd like to clarify: in the worst case scenario, a manual accessor written via pattern matching would still work, right? E.g.

data T = T { x :: forall a. a -> a }

x :: T -> (forall a. a -> a)
x T { x = r } = r
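
A sketch of how that escape hatch might look as a complete module. It assumes GHC 9.2 or later, where NoFieldSelectors is available, so that the field x does not generate a top-level function that would clash with the manual accessor. The useBoth function is hypothetical.

```haskell
{-# LANGUAGE NoFieldSelectors #-}
{-# LANGUAGE RankNTypes #-}

data T = T { x :: forall a. a -> a }

-- Manual accessor written by pattern matching.  With NoFieldSelectors the
-- field x produces no selector function, so this definition can reuse the name.
x :: T -> (forall a. a -> a)
x T { x = r } = r

-- The accessor result stays polymorphic and can be used at two types.
useBoth :: T -> (Int, String)
useBoth t = (x t 1, x t "one")
```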

Co-Authored-By: Arnaud Spiwack <arnaud@spiwack.net>
@nomeata nomeata merged commit 6c52ff7 into ghc-proposals:master May 3, 2020
@nomeata nomeata added Accepted The committee has decided to accept the proposal and removed Pending shepherd recommendation The shepherd needs to evaluate the proposal and make a recommendataion labels May 4, 2020
@TheMatten

What are current thoughts on polymorphic updates? Having classes like:

class GetField (s :: Symbol) a where
  type Field s a
  getField :: a -> Field s a

class SetField (s :: Symbol) x a where
  type Updated s x a
  type Updated s x a = a
  setField :: a -> x -> Updated s x a

class    (GetField s a, SetField s x a) => HasField s x a
instance (GetField s a, SetField s x a) => HasField s x a

It would be easy to implement/derive them. Use cases with monomorphic fields would only need to mention the concrete field type in the instance head, and we could have read-only "virtual" fields constructed from the values of other fields without having to think about setting in any way.
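
To make the suggestion concrete, here is a sketch of hand-written instances for the GetField and SetField classes above, including a type-changing update. The class declarations are copied from the comment; the Box type, the field name "contents", and the relabel function are hypothetical, and the extension list is an assumption about what GHC needs to accept the declarations. The combined HasField class is omitted.

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}

import GHC.TypeLits (Symbol)

class GetField (s :: Symbol) a where
  type Field s a
  getField :: a -> Field s a

class SetField (s :: Symbol) x a where
  type Updated s x a
  type Updated s x a = a
  setField :: a -> x -> Updated s x a

-- Hypothetical one-field container.
newtype Box a = Box a deriving (Eq, Show)

instance GetField "contents" (Box a) where
  type Field "contents" (Box a) = a
  getField (Box v) = v

-- Type-changing update: storing a b in a Box a yields a Box b.
instance SetField "contents" b (Box a) where
  type Updated "contents" b (Box a) = Box b
  setField (Box _) v = Box v

-- Reads an Int and writes back a String, changing the type of the Box.
relabel :: Box Int -> Box String
relabel b = setField @"contents" b (show (getField @"contents" b))
```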

@mageshb

mageshb commented May 9, 2020

Is it possible to use newlines to chain multiple selector applications?

let r = val
           .lengthyFieldName1
           .lengthyFieldName2

Would the above be equivalent to the following under this extension?

let r = val.lengthyFieldName1.lengthyFieldName2

@TheMatten

TheMatten commented May 9, 2020

@mageshb Not yet - for now you'll have to do something like

let r = ((val
      ).lengthyFieldName1
      ).lengthyFieldName2

using parens to keep dot always next to expression on left side.

@adamgundry

@TheMatten on the topic of enhancements to HasField, recent discussion has been on #286 regarding splitting it into two classes and (mostly but not exclusively) on #158 regarding polymorphic updates. The problem here is that there are a lot of variant designs, but a lack of consensus around which to choose. Perhaps a new proposal for adding polymorphic update might help things along.

As I've been implementing the HasField design from #158 I've been wondering if it would make sense to offer both mono-HasField and a generalised HasField' class supporting type-changing update. We could use the same underlying function for generating the dictionary in both cases, with the constraint solver specialising it as needed. Record dot syntax could then default to HasField but allow use of HasField' via a rebindable-syntax-like mechanism, and optics libraries would similarly have a choice of which to use.

I haven't yet been convinced that splitting HasField into two classes is worth doing, because we don't currently have a way to indicate that the automatically-generated instances should be get-only or set-only.

@effectfully

@TheMatten

What are current thoughts on polymorphic updates?

I wrote a post on this topic some time ago. It describes all the known approaches (if something was left out, please let me know) and compares them. The one you've outlined is basically the worst (after having no polymorphic update at all).

@adamgundry

The problem here is that there are a lot of variant designs

There is the bad type family approach, the good functional dependencies approach and the novel SameModulo approach. I don't think anybody is going to commit to the novel approach, given that it doesn't offer much more than the functional dependencies approach, which is simpler and has been around for ages. So in my view there's only one serious contender: the FunDep approach.

offer both mono-HasField and a generalised HasField' class supporting type-changing update.

Please swap the names then, many people are used to Lens' being monomorphic and Lens being polymorphic.

Record dot syntax could then default to HasField but allow use of HasField' via a rebindable-syntax-like mechanism

Why?

I haven't yet been convinced that splitting HasField into two classes is worth doing, because we don't currently have a way to indicate that the automatically-generated instances should be get-only or set-only.

Having

data Person = Person
  { name :: String
  , age  :: Int
  } deriving Show
data Company = Company { owner :: Maybe Person }
  deriving Show

with monolithic HasLens:

incOwnerAge company = company{owner = fmap (\y -> y{age = succ y.age}) company.owner}

with split HasLens:

incOwnerAge company = company.just.owner { age = succ age }

(not sure if x.y.z { a = ... } syntax is supported though, haven't been paying attention)
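
For reference, the first (monolithic) variant can be written as a complete module. This is a sketch assuming GHC 9.2 or later with OverloadedRecordDot; an Eq instance is derived in addition to Show so the result can be compared, and the projections are parenthesised so the code does not depend on how projection and application are prioritised. The second variant relies on a pseudo-field just that is not defined here, so it is not reproduced.

```haskell
{-# LANGUAGE OverloadedRecordDot #-}

data Person = Person
  { name :: String
  , age  :: Int
  } deriving (Eq, Show)

data Company = Company { owner :: Maybe Person }
  deriving (Eq, Show)

-- Nested update through the Maybe, using only field selection and
-- ordinary record update.
incOwnerAge :: Company -> Company
incOwnerAge company =
  company { owner = fmap (\p -> p { age = succ (p.age) }) (company.owner) }
```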

because we don't currently have a way to indicate that the automatically-generated instances should be get-only or set-only.

Even just treating sum-types as set-only is already useful and doesn't require any indications. And having a way to indicate something is only a matter of settling on syntax: just about 500 comments on a proposal and you're done.

@adamgundry

@effectfully

What are current thoughts on polymorphic updates?

I wrote a post on this topic some time ago. It describes all the known approaches (if something was left out, please let me know) and compares them.

Nice, thanks for this! I somehow lost track of it at the time, but this looks very comprehensive and is really helpful as a comparison of the approaches.

The problem here is that there are a lot of variant designs

There is the bad type family approach, the good functional dependencies approach and the novel SameModulo approach. I don't think anybody is going to commit to the novel approach, given that it doesn't offer much more than the functional dependencies approach, which is simpler and has been around for ages. So in my view there's only one serious contender: the FunDep approach.

I haven't thought about it as much recently, but I generally agree. Someone should write a proposal to use the FunDep approach. Any volunteers? Or maybe I'll get to it eventually...

Record dot syntax could then default to HasField but allow use of HasField' via a rebindable-syntax-like mechanism

Why?

Well, it's a complexity trade-off; dropping type-changing update gets you simpler inferred types. It's not completely obvious that having two HasField classes is better than just the more general option, but it should be relatively cheap to implement.

with split HasLens:

incOwnerAge company = company.just.owner { age = succ age }

(not sure if x.y.z { a = ... } syntax is supported though, haven't been paying attention)

I think you'd need general modification syntax for that (something like company { owner.just.age & succ }), which isn't part of the accepted RecordDotSyntax proposal. With the proposal as it stands you could have company { owner.just.age = succ _ } but then can't fill the hole as company.owner.just.age isn't gettable.

Or you could use optics:

incOwnerAge company = company & #owner % _Just % #age %~ succ

Once you get beyond simple (nested) field selection and update, I'd suggest using lenses/optics directly, rather than trying to extend record operators with pseudo-fields like just.

FWIW I consider HasField primarily an implementation detail of the various "syntaxes for record manipulation" provided by optics libraries and RecordDotSyntax, rather than something that end users would be expected to access directly or define APIs with. That's why I favour the s -> (b -> t, a) representation (simple to construct and to convert into a lens, even if it doesn't compose as well). But it also motivates keeping HasField relatively limited in scope: just the automatic constraint solving based on existing record definitions, rather than some more general notion of stringly-named field-like things.

@navid-zamani

What I really do not understand, and should be found here, in case others like me read this to find the reasons:

Why in the world wasn’t another symbol chosen? Why the (unspaced) dot (.), of all things? Why force significant space characters where a simple typo of a literally invisible single space completely changes the meaning of the code? Why make Haskell now literally impossible to write with a pen on paper, where one can’t tell if the space is just space or a space?

It seems the reason for releasing something that is known to be at best a stopgap is that the alternative would be to keep bickering forever. I don’t know what that holdup was, but that argument is fallacious because it ignores the utter triviality of solving the above problem:
Just choose a better symbol!(TM) ^^
I would have just used whatever symbol remains unambiguous with regard to spacing, if you take the ANSI characters from 0x20 upwards and remove all the ones that are already used in that context in Haskell.

The weirdest thing is, how this elephant in the room is never even mentioned anywhere I read. It cannot be that nobody but me ever thought of this, but it can also not be that everyone orchestrated to make even naming that solution a taboo like speaking of the devil. So… did people like me miss an important memo? In that case, please, add that memo to the answer for the above.

This would help people like me a lot with understanding how any of this can be a reasonable best choice even for a temporary stopgap, especially in the face of the dangerous threat of falling down the local energy minimum of "good enough", where the motivation left to actually do it right is too weak to fix it properly. As they say: Nothing more permanent in the world, than a kludge. ;)

Or, alternatively:
Just choose a better symbol!(TM) ^^

I, for one, am already preparing a patch for my own installation of GHC, named RecordSelectorSyntax, that treats the selector as just a normal function and lets you define the symbol to use by importing it under a different name. E.g.:

import Portage (recordSelector as (»))
(Preferably, there should be no reserved syntax at all except declaration (=) and some form of escaping. Only overridable defaults.)

TL;DR: In any case, the way this was communicated down the stream, to us end users, was a train wreck of KDE4.0 or Firefox Fenix proportions. If I didn’t know SPJ et al to have such a highly respectable track record of wise decisions, I’d think they’d lost all their marbles and gone full PHP.JS. ;)

@Vlix

Vlix commented Mar 4, 2022

AFAIK the dot is used because of a few reasons:

  • It is how many other programming languages select a field from an object/dictionary/record, and that has merit when learning the language
  • There's precedent with qualified functions (i.e. Data.List.nub) where the dot is also used, and ALSO in a space-sensitive way (so dots being space-sensitive is nothing new)

Furthermore, everyone using their own selector would make reading other people's code potentially pretty confusing. And if ALL syntax can be switched for whatever, that would potentially make reading other people's Haskell code a nightmare, because you have to keep checking the syntax is still what you think it is ALL the time.

All in all, I'm pretty happy with the implementation and would like to enforce the use of spaces around operators anyway :) Also helps with readability IMHO.

@AntC2

AntC2 commented Mar 6, 2022

@navid-zamani this proposal was accepted over a year ago, and is released already (9.2.1 October last year). So I think commenting here is not the place. Suggest you start a thread on Discourse.

Contra your gloom-laden predictions, I haven't seen the sky falling since October:

  • None of this change to parsing . happens unless you switch on -XOverloadedRecordDot. If you don't like this extension, don't use it; all the previous ways to access records still work.
  • "a simple typo of a literally invisible single space completely changes the meaning of the code" is not a risk, I think: with such a typo, you'll get an ill-typed program, not a valid one with a different meaning. If you have an example of something other than GHC rejecting a program, please post an issue.
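
As a small illustration of how the two readings of the dot coexist once the extension is switched on, here is a sketch assuming GHC 9.2 or later. The Point record and the function names are hypothetical, and the projections are parenthesised so the code does not rely on precedence.

```haskell
{-# LANGUAGE OverloadedRecordDot #-}

-- Hypothetical record type.
data Point = Point { x :: Int, y :: Int }

-- No whitespace around the dot: field selection.
manhattan :: Point -> Int
manhattan p = abs (p.x) + abs (p.y)

-- Whitespace on both sides of the dot: ordinary function composition,
-- exactly as without the extension.
doubleThenShow :: Int -> String
doubleThenShow = show . (* 2)

-- The two forms can be mixed within one module.
describe :: Point -> String
describe = doubleThenShow . manhattan
```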

Haskell now [gratuitous emphasis deleted] impossible to write with a pen on paper

Most of us use a computer to write Haskell code. There were already several places in H98 where spaces around . made a difference:

3.14                -- not the same as `3 . 14`                  -- which is ill-typed
Prelude..           -- not the same as `Prelude . .`             -- which is a syntax error 
Module.ModSub.name  -- not the same as `Module . ModSub . name`  -- probably rejected

How are you distinguishing those with your quaint technology?

Just choose a better symbol

Probably every possible symbol is already used in somebody somewhere's code. So would you like to explain to them why you want to break their code? The advantage of . is that Haskell already treats it specially, even though it's also just (ha!) an operator in the Prelude.

And what @Vlix said.

@navid-zamani

Thank you for these very reasonable and level-headed replies.
While I still do not think it is the best solution, surprisingly, it is now fine with me.
I think the replies will serve well for people looking for an answer like me. (Because all such searches lead here.)

The Haskell community proves to be the best again. :)

(Now all I wish is for the QWERT* layouts to die a deserved death. … Using NEO 2.0 over here, think German Dvorak but a thousand times better. ;)

--

@AntC2: „Contra your gloom-laden predictions, I haven't seen the sky falling since October“ … Well, piecemeal tactics (what I mistranslated as “slippery slope“) always sneak in unnoticed, bit by bit, until it’s too late. That was kinda my point there. … But you followed it with good arguments, so in return for them, that “This is fine“ emotion-laden statement is forgiven as well. :)

@HugoPeters1024

HugoPeters1024 commented Mar 21, 2022

For my projects I always add an instance that lifts the HasField typeclass to a Maybe. Do people think it is a reasonable idea to add this to the standard library? In the extreme case it can even be done for any functor.

instance HasField (field :: k) r t => HasField field (Maybe r) (Maybe t) where
    getField = fmap (getField @field)

Although perhaps a different operator is warranted.
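
A sketch of how such an instance might be used with record dot syntax. It assumes GHC 9.2 or later, the HasField class exported from GHC.Records with getField as its method, and an extension list that is a guess at what the instance needs (UndecidableInstances because of the functional dependency). The Person and Company records and ownerName are hypothetical, and the instance is an orphan.

```haskell
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedRecordDot #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE UndecidableInstances #-}

import GHC.Records (HasField (..))

-- Hypothetical records.
data Person = Person { name :: String, age :: Int }
data Company = Company { owner :: Maybe Person }

-- The instance from the comment above: selecting a field of a Maybe record
-- selects it inside the Maybe.
instance HasField (field :: k) r t => HasField field (Maybe r) (Maybe t) where
  getField = fmap (getField @field)

-- c.owner has type Maybe Person, so the second selection goes through
-- the lifted instance and yields Maybe String.
ownerName :: Company -> Maybe String
ownerName c = c.owner.name
```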

@ocharles

This is a tricky one. I also do this for some functors (in particular reactive-banana's Behavior and Event feel like very natural places to add such an instance). For Maybe though, I'm not so sure...

@phadej

phadej commented Mar 21, 2022

@ocharles you are right to be suspicious

instance HasField (field :: k) r t => HasField field (Maybe r) (Maybe t) where
    getField = fmap (getField @field)

works for getField, but doesn't work so well for hasField :: r -> (a -> r, a) (it would for representable functors, which Behavior kind of is, but Maybe isn't).

@parsonsmatt

Requiring that you have a Setter for any Getter is a pretty bad design decision - can we please keep them as separate concerns?

@tomjaguarpaw

@parsonsmatt I believe there was a fair amount of discussion of that, for example at #286

Labels
Accepted The committee has decided to accept the proposal Implemented The proposal has been implemented and has hit GHC master