Linebreaking support #127

Open
fred-wang opened this issue Jul 5, 2019 · 24 comments

Comments

@fred-wang
Contributor

cc @bfgeek

The MathML Core spec now defines all the min-content / max-content values; however, these two are equal and linebreaking is supposed to never happen.

I believe linebreaking could potentially happen in:

  • Inline equations. However, I think defining CSS fragments will be a bit complicated.

  • mrow-like elements in display equations. We have an experiment like this in Chromium. Currently that depends on the parent's width; however, we could do something similar if a width is specified on the element (Generate new PNGs for examples in the core spec #45).

  • mtable (I guess that's already defined by CSS). Maybe we want to prevent it for now.

The MathML 3 rules for linebreaking are quite complex; maybe we should have a simple version first and refine it later, or get it improved by polyfills when the CSS Layout API is ready.

Just opening this so that it can be referenced from the spec.

@fred-wang
Contributor Author

We need to investigate this a bit, but I imagine we could introduce a math-wrap property (or rely on an existing CSS one) in the future, which would default to nowrap if we are concerned about backward-compatibility changes.

@fred-wang
Contributor Author

Gecko disables linebreaking in table cells:
https://dxr.mozilla.org/mozilla-central/source/layout/mathml/mathml.css#137

WebKit disables linebreaking in foreign content:
https://trac.webkit.org/browser/webkit/trunk/Source/WebCore/css/mathml.css#L134

I guess we could set white-space to nowrap by default on MathML elements to prevent any backward-compatibility issue.

@fred-wang
Contributor Author

It would probably be interesting to start experimenting with line breaking using the CSS Layout API ( https://drafts.css-houdini.org/css-layout-api/ ), maybe starting with basic tests and then trying to write an mrow-like layout.

In any case, I think #123 should be resolved first.

@ronkok

ronkok commented Dec 6, 2019

Line breaks are more important in the mobile-screen world than they were years ago. I think this should get a high priority.

The TeXbook, page 173, states that "A formula will be broken only after a relation symbol like $=$ or $<$ or $\rightarrow$, or after a binary operation symbol like $+$ or $-$ or $\times$, where the relation or binary operation is on the ``outer level'' of the formula (i.e., not enclosed in {...} and not part of an \over construction)."

KaTeX does its best to emulate the TeXbook rule. For my own work, this issue is the single thing that will cause me to use KaTeX HTML rather than MathML.

@NSoiffer
Contributor

NSoiffer commented Dec 8, 2019

The rules that TeX has for linebreaking only apply to inline math. TeX categorizes symbols into a small number of categories (to fit in four bits because efficiency was critical at the time of its design). This means that there are broad generalizations that work a lot of the time but not always. E.g., (a+b)⋅(a-b) might break at the + or -, which would be a poor place to break.

Generally, in linebreaking you want to look at the expression tree and break as close to the root as possible while still filling as much of the line as possible. Typically, relational operators will be at the root of a tree, then lower precedence operators like + and -, then higher precedence operators like ⋅, ⨯, and /. Well-structured MathML (which is unfortunately not very common) has mrows that align with the expression tree, so finding a good spot to break is easier. MathML's operator dictionary gives priorities of operators that can be used for parsing and linebreaking, and that also serve as a guide for spacing (higher priorities have less space around them in general). MathML 3 lists a potential linebreaking algorithm that takes time proportional to the number of token elements times the number of lines, so it is relatively quick and does a pretty good job. A more complicated version that looks at the whole expression rather than a single line would mimic TeX's paragraph linebreaking rules. I would like to see the simpler algorithm become part of the core spec, but I understand we need to have priorities and this might add several weeks of implementation and spec time, if it is cleanly doable at all with the current state of CSS.
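To make the "break as close to the root as possible while still filling the line" idea concrete, here is a minimal JavaScript sketch. It is not the MathML 3 algorithm. The token shape ({ text, width, depth, canBreakAfter }), the function name chooseBreaks, and the tie-breaking rule are all assumptions made for this illustration; depth stands for the nesting depth in the expression tree, with 0 at the top level.

```javascript
// Hypothetical token shape: { text, width, depth, canBreakAfter }.
// depth = nesting depth in the expression tree (0 = top level).
// Returns the index of the last token on each line except the final line.
function chooseBreaks(tokens, maxWidth) {
  const breaks = [];
  let start = 0;
  while (start < tokens.length) {
    let used = 0;
    let end = start; // one past the last token that fits on this line
    // Always accept at least one token so an over-wide token cannot stall the loop.
    while (end < tokens.length &&
           (end === start || used + tokens[end].width <= maxWidth)) {
      used += tokens[end].width;
      end++;
    }
    if (end === tokens.length) break; // the remainder fits on one line

    // Among the tokens that fit, prefer the shallowest break opportunity;
    // on equal depth, prefer the latest one so the line is filled further.
    let best = -1;
    for (let i = start; i < end; i++) {
      if (!tokens[i].canBreakAfter) continue;
      if (best === -1 || tokens[i].depth <= tokens[best].depth) best = i;
    }
    if (best === -1) best = end - 1; // no break opportunity: break after the last fitting token
    breaks.push(best);
    start = best + 1;
  }
  return breaks;
}

// (a+b)⋅(a−b), every token one unit wide; only operators allow a break.
const demoTokens = ["(", "a", "+", "b", ")", "⋅", "(", "a", "−", "b", ")"].map((text) => ({
  text,
  width: 1,
  depth: text === "⋅" ? 0 : 1,
  canBreakAfter: ["+", "−", "⋅"].includes(text),
}));
const demoBreaks = chooseBreaks(demoTokens, 9);
```

With a line width of 9 units the "−" would still fit on the first line, but the sketch breaks after the top-level "⋅" (index 5) instead, which is the behavior argued for above. A purely greedy rule that ignores depth would have broken inside the second parenthesis.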

In response to earlier comments:

  1. Linebreaking inside of tables/matrices is complicated if you want to do a good job. That's because if the width of the table cell is computed automatically, you really want to do that knowing how good/bad that width is for linebreaking the expression. If there is a fixed column width, then it is no more complicated than normal linebreaking. In my former job, we had a publisher that had two column layout and put math inside of tables in each column, so they were very concerned about good linebreaking. The point of this comment is that allowing linebreaks in math in tables will address real needs of publishers. On phones, those issues will be there for any math in a table.

  2. In addition to linebreaks, for display math, one needs to deal with indentation. As with linebreaks, indentation levels reflect the expression tree and help readers understand what is grouped with what.

  3. Linebreaking does require another pass over the layout once sizes have been determined, but it is not rocket science to figure out good linebreaks, at least conceptually. Whether it fits in with the CSS layout model is (I think) the main question. The CSS Layout API mentioned earlier seems promising with indentation maybe done by left-padding each line.

  4. Any polyfill that did linebreaking would cause reflow, which would be bad. It would have to come after layout is done. Potentially a polyfill could be written that creates well-structured mrows so that the browser implementation has an easier time doing linebreaking.

  5. MathML 3 has a number of manual linebreaking and indenting options that can be set on mo. Maybe the first step is for the core spec to specify those. That would at least allow an author to get some linebreaking/indentation to happen so that (for example) an expression with multiple = signs can be broken at the =s and aligned.

@fred-wang
Contributor Author

I think there are two important points in Neil's reply:

  • We first need to focus on properly implementing a CSS-compatible and interoperable math rendering in all browsers, which is what we are doing with MathML Core. This must be incremental, and starting to introduce a lot of advanced features before the first step is done is counter-productive if we want to keep the support we've gotten from browser vendors and CSS WG people. Everybody definitely agrees that linebreaking is an important property of text/CSS/math layout, though.

  • The idea with MathML Core is that you can just use normal CSS/JS technologies as for other HTML elements. So if KaTeX or other polyfills are able to do linebreaking with HTML elements, they could just follow a similar approach with MathML elements. With the technologies currently available (e.g. without the CSS Layout API) that would be bad for performance because it would require forcing relayout, but it's probably not that bad compared to what these polyfills are doing right now (which requires doing the full math layout themselves). So I think it could be an interesting experiment to try in the short term.

@ronkok

ronkok commented Dec 8, 2019

> This must be incremental

Agreed and acknowledged. The work being done is excellent.

> MathML 3 has a number of manual linebreaking and indenting options that can be set on mo.

Yes, MathML 3 contemplates an attribute of linebreakstyle on a <mo>. It would be great if this were to be specified and actually implemented, unlike in current Firefox.

> The idea with MathML Core is that you can just use normal CSS/JS technologies as for other HTML elements.

If I understand that statement correctly, then one could apply an inline style of display: inline-block to a top-level <mo> and it would act just like a <mo> with a linebreakstyle="before" attribute. That would also be terrific and would be all that I ask.

Do I understand that statement correctly?

@ronkok

ronkok commented Dec 8, 2019

> So if KaTeX or other polyfills are able to do linebreaking with HTML elements, they could just follow a similar approach with MathML elements.

A similar approach would break the top level into multiple mrows, with each break occurring at a binary or relational operator. That would map a + b = d into:

<mrow><mi>a</mi><mo>+</mo></mrow>
<mrow><mi>b</mi><mo>=</mo></mrow>
<mrow><mi>d</mi></mrow>

This approach would create automatic line breaks in the TeXbook locations. It works, at some cost to the semantics.

As suggested, KaTeX could implement this method. It is very similar to what is now done in HTML. If the method in the previous statement will not work, the method in this comment is probably what we will do. Let me know where we stand.
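As an illustration of the multiple-mrow mapping described in this comment, here is a minimal JavaScript sketch that needs no browser APIs, so it could run client-side or server-side. It is not KaTeX code. The token shape ({ tag, text }), the function names groupIntoRows and escapeXml, and the small operator set are assumptions made for the example; a real implementation would consult the operator dictionary and only consider operators at the outer level of the formula.

```javascript
// Operators treated as break opportunities. This is an illustrative subset,
// not the MathML operator dictionary.
const BREAK_AFTER = new Set(["=", "<", ">", "→", "+", "-", "−", "×"]);

// Escape the characters that are special in XML text content.
function escapeXml(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// tokens: top-level tokens of the formula, each { tag: "mi" | "mn" | "mo", text }.
// Returns MathML source in which every <mrow> ends with a break-permitting operator
// (except possibly the last one).
function groupIntoRows(tokens) {
  const rows = [];
  let current = [];
  for (const tok of tokens) {
    current.push(`<${tok.tag}>${escapeXml(tok.text)}</${tok.tag}>`);
    if (tok.tag === "mo" && BREAK_AFTER.has(tok.text)) {
      rows.push(current); // close the row right after the operator
      current = [];
    }
  }
  if (current.length > 0) rows.push(current);
  return rows.map((r) => `<mrow>${r.join("")}</mrow>`).join("\n");
}

// a + b = d
const grouped = groupIntoRows([
  { tag: "mi", text: "a" },
  { tag: "mo", text: "+" },
  { tag: "mi", text: "b" },
  { tag: "mo", text: "=" },
  { tag: "mi", text: "d" },
]);
```

For a + b = d the sketch produces the same three rows shown above. Breaking before relations instead of after binary operators would only change where a row is closed, not the overall structure.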

@fred-wang
Contributor Author

> So if KaTeX or other polyfills are able to do linebreaking with HTML elements, they could just follow a similar approach with MathML elements.

> A similar approach would break the top level into multiple mrows, with each break occurring at a binary or relational operator. That would map a + b = d into:

This is the short-term approach I was thinking about. You can use getBoundingClientRect() to know the position and size after layout in order to apply line breaking depending on the screen size. If semantics is a problem, note that you can put these split MathML pieces into a shadow tree so that the original MathML DOM is still available.
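A minimal JavaScript sketch of the measuring step, assuming the formula has already been split into top-level pieces: it reads each piece's width with getBoundingClientRect() and packs the pieces greedily into lines. The function name assignLines and its return shape are assumptions made for this example, and it ignores indentation and the shadow-tree part entirely.

```javascript
// rows: array-like of elements (for example the top-level <mrow> pieces),
// each exposing getBoundingClientRect(). availableWidth: container width in px.
// Returns an array of lines, each an array of indices into rows.
function assignLines(rows, availableWidth) {
  const lines = [];
  let used = 0; // width consumed on the current line
  Array.from(rows).forEach((row, i) => {
    const { width } = row.getBoundingClientRect();
    if (lines.length === 0 || used + width > availableWidth) {
      lines.push([i]); // start a new line; a piece wider than the line stays alone
      used = width;
    } else {
      lines[lines.length - 1].push(i);
      used += width;
    }
  });
  return lines;
}

// In a browser this might be called as (hypothetical wiring):
//   const math = document.querySelector("math");
//   const lines = assignLines(math.children,
//                             math.parentElement.getBoundingClientRect().width);
```

The measurements are only meaningful after layout has happened, which is why this approach forces a relayout when the result is applied.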

@ronkok

ronkok commented Dec 8, 2019

@fred-wang Thank you for the quick response. That clears up the picture considerably.

@fred-wang
Contributor Author

@ronkok No problem. Additionally, note that you can use https://developer.mozilla.org/en-US/docs/Web/API/ResizeObserver to watch when the width of the container of the <math> tag changes in order to update linebreaking (e.g. when the user resizes the window).
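A minimal JavaScript sketch of that wiring, using only the standard ResizeObserver surface (constructor callback, observe(), disconnect(), and entry.contentRect). The function name watchContainerWidth, the relayout callback, and the injectable ObserverCtor parameter are assumptions made for the example; the constructor is injectable only so the sketch can be exercised outside a browser.

```javascript
// container: the element that wraps <math>.
// relayout: callback invoked with the new content width in px.
// ObserverCtor: defaults to the browser's ResizeObserver.
// Returns a function that stops watching.
function watchContainerWidth(container, relayout, ObserverCtor = globalThis.ResizeObserver) {
  let lastWidth = -1;
  const observer = new ObserverCtor((entries) => {
    for (const entry of entries) {
      const width = entry.contentRect.width;
      if (width !== lastWidth) { // skip height-only changes
        lastWidth = width;
        relayout(width);
      }
    }
  });
  observer.observe(container);
  return () => observer.disconnect();
}

// In a browser this might be used as (hypothetical wiring):
//   const stop = watchContainerWidth(math.parentElement, (w) => redoLinebreaks(math, w));
// where redoLinebreaks is whatever routine recomputes the breaks.
```

Comparing against the previous width keeps the relayout from running again when the linebreaking itself changes only the container's height.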

@ronkok

ronkok commented Dec 8, 2019

KaTeX avoids, so far, any reliance on the browser for runtime information. It generates code that works whether generated client-side or server-side. So I think we're stuck with the multiple <mrow> approach.

@fred-wang
Contributor Author

consensus from 2020/06/23: postpone to a future version

@NSoiffer
Contributor

Since it is not in this issue and might prove useful to a future core implementation... I implemented a linebreaking polyfill back in 2020 (seems like a lifetime ago...). You can see it in action on github.io. Click on Apply Transform to see it work (if you have Chrome/Edge, the MathML display needs to be on).

This transform makes use of one-column mtables as its target because there is currently no other way to get multiple lines to show up in the implementations. If core supported a manual linebreak (i.e., if <mo linebreak='newline'> were supported and forced the start of a new line), then this polyfill could take advantage of that and it would be much less intrusive than what it currently does to the MathML by adding an mtable.

Note: indentation is done using mspace and that same idea would carry forward to a version of core that supported a manual linebreak.
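To show the shape of that output, here is a minimal JavaScript sketch that wraps already-chosen lines in a one-column mtable and indents with mspace. It is not the polyfill's code. The function name linesToTable and the input shape ({ indentEm, mathml }) are assumptions made for the example, and the alignment of the cell content is left out.

```javascript
// lines: array of { indentEm, mathml }, where mathml is the MathML source
// for one line and indentEm is its indentation in em units.
// Returns a one-column <mtable> with one row per line.
function linesToTable(lines) {
  const rows = lines.map(({ indentEm, mathml }) => {
    // An <mspace> with a width acts as the indent; no element is emitted for 0.
    const indent = indentEm > 0 ? `<mspace width="${indentEm}em"></mspace>` : "";
    return `<mtr><mtd>${indent}${mathml}</mtd></mtr>`;
  });
  return `<mtable>${rows.join("")}</mtable>`;
}

const table = linesToTable([
  { indentEm: 0, mathml: "<mi>a</mi><mo>=</mo>" },
  { indentEm: 2, mathml: "<mi>b</mi>" },
]);
```

If a manual linebreak on mo were available, the same line data could presumably be expressed without the surrounding mtable, keeping only the mspace indents.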

@dginev

dginev commented Jun 26, 2023

Since we have a prolonged gap period here, are there any current recommendations for pure CSS solutions for reflow?

I took a stab at switching the display of a simple equation to inline-flex with a corresponding @media query for small viewports, and it seemed to behave quite reasonably for a failsafe (in FF and Chrome).

Here is an example of that (with flex always on). The demo should be able to render 6 different arrangements as the screen shrinks:
https://codepen.io/dginev/pen/rNQjdzR

It would take some more fine-tuning to control the finer details of reflow, but this could already be a healthy upgrade for common equation markup.

@ronkok

ronkok commented Jun 26, 2023

I can confirm that a flex-based solution works pretty well. You can see it already in action if you navigate to Temml.org and turn display mode off.

In default mode, Temml writes MathML with <mrow> elements that each end in a binary operator or relation operator (per The TeXbook, p. 173). Then the <math> element carries the following CSS:

/* flex-wrap for line-breaking in Chromium */
math {
  display: inline-flex;
  flex-wrap: wrap;
  align-items: baseline;
}
math > mrow {
  padding: 0.5ex 0ex;
}

/* Avoid flex-wrap in Firefox */
@supports (-moz-appearance:meterbar) and (display:flex) {
  math { display: inline; }
  math > mrow { padding: 0 }
}

I don’t apply flex-wrap in Firefox. In Firefox, the separation by <mrow> elements already works without a flexbox.

Temml has a rendering option which allows a website administrator to select breaks before = signs instead of after binary operators. The CSS remains the same, but Temml generates MathML with differently grouped <mrow> elements.

In the comments above, I was one of those asking for line-breaking action in the MathML Core specification. At this time, I do not make that request. In Chromium and Firefox, I think line-breaking is largely a solved problem. Chromium and WebKit have much bigger rendering problems to solve.

Sadly, flex-based line-breaking does not work in WebKit. Maybe someday.

@dginev

dginev commented Jun 27, 2023

@ronkok This is great, thank you for the extra context.

But I wouldn't go beyond calling this a "stopgap solution", since you've enumerated some serious problems. WebKit lacking support is one, another is ending up with two non-standard solutions for Chrome and Firefox.

Not having a standard way to manually force (or softly suggest) a linebreak in the usual MathML markup is a third. Flexbox allows a variety of techniques to support what Neil referred to as <mo linebreak='newline'>, but none seems to work on an element with pre-set content. Instead, with flexbox we seem to need a dedicated empty element to indicate the forced break (similar to the empty <br> in HTML). Here is an example of the best I could come up with (since it had to be empty, I used <mspace>):
https://codepen.io/dginev/pen/zYMNjgd

In summary:

<mrow> ...LHS... </mrow>
<mo>=</mo>
<mspace class="linebreak"></mspace>
<mrow> ...RHS... </mrow>

mspace.linebreak {
  flex-basis: 100%;
  height: 0;
  width: 0;
  overflow: hidden;
}

Update: In cases where we have a <semantics> wrapper to also hold Content MathML under the root math element, the inline-flex approach doesn't appear to be possible in Firefox. So it's a Chrome-only trick at this point.

As an alternative (and maybe even worse) trick, one could rearrange the <mrow> structure by tucking the equal sign into the left-hand side mrow, which will have them treated as a single flex item. But that would break the default spacing support available via the operator dictionary. So I think inline-flex "mostly working" is a nice surprise, but it still appears to be a crutch to duct-tape some reflow together until we have a proper mechanism.

@bkardell
Collaborator

> In Chromium and Firefox, I think line-breaking is largely a solved problem. Chromium and WebKit have much bigger rendering problems to solve.

One of these is a typo, I suppose? The second one?

@ronkok

ronkok commented Jun 27, 2023

To clarify: Temml's <mrow> trick works pretty well at providing non-display mode line-breaking in both Chromium and Firefox. It does not work in WebKit.

On the more general question, I have begun compiling a list of browser issues. Firefox is the best, by a large margin. Chromium has some serious issues, especially if system fonts are used. WebKit is the worst. It cannot even render an accent at the correct vertical alignment.

I put a lot of work into the Temml library. It is my sincere hope that someday, MathML will get widespread use. But I think that day is not yet. I suspect most web site administrators will avoid MathML until browser rendering is more reliable.

@NSoiffer
Contributor

NSoiffer commented Aug 3, 2023

@ronkok: I'm a little late to the game, but I finally tried temml.org in Chrome and it is great to see that the flex-based solution works. But putting aside that it currently only works in Chrome/Edge, I would add that it is only half the solution. Not only does an expression need to wrap long lines, it needs to indent them appropriately. Is that something that can be done with flexbox?

The polyfill I mentioned illustrates why that's important. But it is hacky in that it had to create an mtable. I hope core level 2 will have at least the hook mentioned.

@ronkok

ronkok commented Aug 3, 2023

@NSoiffer I'd like to look at your polyfill in action, but I'm getting a 404 error when I click on that link.

I'm still thinking about how to get an indent. I have a version working in a web app I call Hurmet.app. It renders math with Temml's version of MathML and I have it set to wrap before top-level = characters. Here is a screenshot of how that looks:
[Screenshot: Math wrap]

To get that indent, I have appended the following CSS.

/* Create a hanging indent on calculations that wrap to a second line. */
.hurmet-calc > math > mrow:not(:first-child) { margin-left: 2em }
.hurmet-calc > math > mrow:not(:last-child) { margin-right: -2em }

That's pretty hacky and I have applied it only to my own site, Hurmet, not to the library Temml. I'd like better control over the width of the indent. I'd like something less odd. But it will serve as a temporary line breaking solution. I agree with both you and @dginev that this is a temporary fix and would benefit from better support in MathML-Core someday.

But there are several other rendering issues that I think should get higher priority.

@davidcarlisle
Collaborator

davidcarlisle commented Aug 3, 2023 via email

@davidcarlisle
Collaborator

@NSoiffer @ronkok fixed at
https://w3c.github.io/mathml-polyfills/acid-test.html

@ronkok

ronkok commented Aug 3, 2023

Thank you for the link to the polyfill. It's nice work.

Temml is written to run either client-side or server-side. It therefore does not have access to document.getElementById() and cannot use the techniques in the polyfill. Temml line-breaking is a CSS solution.

There is room in the world for both CSS solutions and JavaScript solutions. Hopefully, one day the browser will have a native solution and neither will be necessary.
