DevMeeting at RubyKaigi 2024

Proposal from Matz

  • Syntax moratorium
    • Matz will be declaring a syntax moratorium in his keynote for about a year or two.
    • This is to help parser maintainers, so that they don't have to keep playing catch-up.
    • Do not share this with anyone else until the keynote.
    • There are a few syntax change proposals in-flight, we need to address them first.
    • Brandon: What about pattern-matching syntax changes that are in-progress (find pattern & ^(expr))?
      • Matz: Those will go in. Prism should support them.
      • Benoit: Prism already supports them today.
    • mame: Prism has almost caught up with the current Ruby syntax successfully, but is a moratorium really needed? For the Prism compiler?
  • Prism compiler potentially as default for Ruby 3.4
    • Ufuk: 35 tests not yet passing, rest works on Prism.
    • Matz: We need to check Prism against real-life code to make sure it works correctly. But good progress on Prism compiler.

[Bug #20468] Segfault on safe navigation in for target (nobu)

  • Also constants are valid?

Preliminary discussion:

  • Currently, for foo&.bar in []; end causes a segfault.
  • Nobu asks what should the result be for this expression?
  • Kevin N says there are more cases.
  • What should the spec be?

Discussion:

  • Matz: It should be accepted and ignored.
  • Nobu: For constants it doesn't make sense to use them in for loops
    • Matz: But it is still valid syntax.
  • Nobu: So will the semantics be like zverok's code example in the issue:
    for temp in [1,2,3]
      foo&.bar = temp
    end
    
  • mame: for a.b.c in [1, 2]; end should evaluate a.b twice?
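mame's question can be checked with a small sketch. It uses a plain attribute target (no safe navigation, so it does not touch the segfault case) and counts how often the receiver expression runs. The names holder, receiver and receiver_calls are made up for illustration.

```ruby
# Illustrative names only: `holder`, `receiver`, `receiver_calls`.
holder = Struct.new(:c).new
receiver_calls = 0
receiver = -> { receiver_calls += 1; holder }

seen = []
# The loop target is an attribute assignment: each element is assigned
# through `receiver.call.c = element` before the body runs.
for receiver.call.c in [1, 2]
  seen << holder.c
end

puts "receiver evaluated #{receiver_calls} times, saw #{seen.inspect}"
```

The receiver expression is re-evaluated on every iteration, which is the behaviour mame is asking about for a.b.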

Conclusion:

  • Matz: We will fix the bug, but no syntax change.

[Feature #19979] Allow methods to declare that they don't accept a block via &nil (ufuk)

Discussion:

irb(main):001> def foo(**nil); end
=> :foo
irb(main):002> method(:foo).parameters
=> [[:nokey]]
  • Benoit: What if iseq compilation was lazy? Then we wouldn't know whether the method yields until it is called (unless that is tracked in the parser, maybe).
  • Benoit: what if the detection of used block can only be done when the method is called and not at parse/compile time?
  • headius: A warning will always miss eval cases, so I am in favour of static annotation. Also, do we need to detect zsuper?
  • Koichi: I don't like &nil syntax, but I find it is quite hard to tell at runtime.
  • byroot: What if it was a method annotation, instead of new syntax? Something like:
    noblock def foo(a, b)
    end
    

Conclusion:

  • Timeout
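
byroot's annotation idea from the discussion can be approximated in plain Ruby today. The following is only a rough sketch: NoBlock, noblock and Greeter are hypothetical names, not an existing or proposed API, and the check happens at call time rather than being declared statically as &nil would be.

```ruby
# Hypothetical sketch: `NoBlock` / `noblock` are illustrative names only.
module NoBlock
  # Wraps an already-defined method so that passing a block raises.
  def noblock(name)
    original = instance_method(name)
    define_method(name) do |*args, **kwargs, &block|
      raise ArgumentError, "#{name} does not accept a block" if block
      original.bind_call(self, *args, **kwargs)
    end
    name
  end
end

class Greeter
  extend NoBlock

  # `def` returns the method name as a Symbol, so it can be passed on.
  noblock def greet(a, b)
    "#{a} #{b}"
  end
end

puts Greeter.new.greet("hello", "world")
begin
  Greeter.new.greet("hello", "world") { }
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Unlike a syntax-level marker, this wrapper is invisible to method(:greet).parameters, which is one reason the discussion leaned towards something the parser or compiler can see.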

[Feature #15554] warn/error passing a block to a method which never use a block (ko1)

  • should we continue with "relaxed" warning?
    • strict: warn for any method that may not use a passed block
    • relaxed: warn only if no method with the same name may use the given block.
    class C
      def foo = nil
    end
    class D
      def foo = yield # unrelated to C#foo
    end
    C.new.foo { assert false }
    # warning in strict mode
    # do not warn in relaxed mode
  • byroot: strict 90% false positive in rails/rails test suite, relaxed brings it down to 10% false positive, and almost no false negative.
  • Benoit: adding &b to duck-typing methods seems helpful for readability and indicates "intentionally ignores block". But too many warnings is bad; maybe make it opt-in via a flag, env var, or pragma?
    • Matz: makes sense to enable for some codebases if people want
    • env var? command line option? (--strict-unused-block-warning / --pedantic-warnings ?)
    • akr: Or new warning category? pedantic category that gives more warnings?

Conclusion:

  • Matz: Give me your comments in the issue.

[Feature #20470] Extract Ruby's Garbage Collector (peterzhu2118)

  • Splits GC into two files gc.c and gc_impl.c.
  • gc.c only contains code not specific to Ruby GC.
  • gc_impl.c contains the implementation of Ruby's GC.
  • gc_impl.c only uses public APIs in Ruby and a limited set of functions exposed in gc.c. This allows us to build gc_impl.c independently of Ruby and plug Ruby's GC into itself.

Discussion:

  • Peter: Describing the change and why it is being proposed.
  • Matz: I am neutral. Separating the implementation is generally a good thing. My concern is: Can we define a complete set of API for GCs?
    • Peter: Currently this is not complete at all. You can implement a simpler GC than Ruby's GC, but not more complex. This will evolve going forward.
  • Koichi: Can we clarify what GC API means? Is it for external GC implementations? I am concerned with how future changes will happen. For example, what if we want to introduce Ractor-local GC in the future, will it be possible?
    • Peter + Aaron + Matt: The API is not "stable", it is OK to break the API. What we want to introduce is some interface to separate implementation, but there is no expectation for it to not change.
  • Nobu: This kind of entry point can be a security vulnerability, since it allows injecting a shared library into Ruby.
    • Peter: This is a feature that needs to be turned on at compile time, and then inject a dylib at runtime, so anyone using this should know the security implications already. Default behaviour is no injectable GC.

Conclusion:

  • Pending further discussion
  • The relationship between regexp encoding, file encoding, string encoding, and matchee encoding is very confusing.
  • A migration path is proposed in the ticket.

Discussion:

  • Matz: It should be removed in the future. The problem is the timing.

Conclusion:

  • We can start deprecation process

[Feature #20415] Precompute literal String hash code during compilation (byroot / etienne) (byroot)

  • Simple optimization that closes most of the performance gap between symbol indexed hashes and string indexed ones for string literals.
  • Multiple ways it could be implemented, either can be done for all string literals or just the ones that have enough free space.
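
A rough way to observe the gap this optimization targets is to time lookups with a symbol literal against lookups with a string literal. This is only a sketch: the hash contents and iteration count are arbitrary, and timings vary by machine and Ruby version, so no numbers are claimed here.

```ruby
require "benchmark"

by_symbol = { name: "ruby" }
by_string = { "name" => "ruby" }

n = 200_000
# Symbol keys carry a precomputed identity hash; the string literal key
# has to be hashed on lookup unless its hash code was stored ahead of time.
sym_time = Benchmark.realtime { n.times { by_symbol[:name] } }
str_time = Benchmark.realtime { n.times { by_string["name"] } }

puts format("symbol keys: %.4fs, string keys: %.4fs", sym_time, str_time)
```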

Discussion:

  • byroot: Explain the ticket. Real performance gains in liquid-c and rack benchmarks.
  • nobu: Did you address the unaligned read problem?
    • byroot: yes, we applied your suggestion.

Conclusion:

  • Accepted

[Feature #6648] Provide a standard API for retrieving all command-line flags passed to Ruby (eregon)

Discussion:

  • Benoit: Does anyone have any opinions about the proposed API?
  • byroot: My first instinct would be to add it to Process instead of RbConfig, but it is not that important.
    • Benoit: RbConfig already has .ruby method to return running Ruby path, so it makes sense to add .ruby_args method as well.
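
A typical use case is re-spawning the current interpreter. In the sketch below RbConfig.ruby is the existing API, while RbConfig.ruby_args is only the name proposed in the ticket, so the code falls back to an empty list when it is not defined.

```ruby
require "rbconfig"

# `RbConfig.ruby_args` is the proposed (not yet decided) API; guard for it.
extra_args = RbConfig.respond_to?(:ruby_args) ? RbConfig.ruby_args : []

# Run a child interpreter using the same binary as the current process.
output = IO.popen([RbConfig.ruby, *extra_args, "-e", "print 1 + 1"], &:read)
puts output
```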

Conclusion:

  • Benoit: we need some decision on the method name & on which module.
  • Replied on ticket, matz needs to respond (TODO)

[Misc #20406] Question about Regexp encoding negotiation (andrykonchin)

  • It isn't clear enough how a Regexp encoding (what Regexp#encoding method returns) is calculated in case an encoding modifier (e.g. u, e, etc) is specified.
  • The documentation states that a Regexp with only US-ASCII characters has US-ASCII encoding, otherwise a regular expression is assumed to use the source encoding. This can be overridden with encoding modifiers.
  • But these rules don't seem to work in the following example: /#{} a/e - it's supposed to have EUC-JP encoding but actually has US-ASCII.

Discussion:

  • Benoit:
    p /\xc2\xa1/e     .encoding # EUC-JP
    p /#{ }\xc2\xa1/e .encoding # EUC-JP

    p /a/e            .encoding # EUC-JP
    p /a #{} a/e      .encoding # EUC-JP
    p /#{} a/e        .encoding # US-ASCII
    
  • Benoit: Is this a bug? If this is intentional, what is the rule?

Conclusion:

  • naruse said encoding flag seems ignored, will reply on ticket (TODO)

[Misc #20407] Question about applying encoding modifier to an interpolated Regexp (andrykonchin)

  • It isn't clear enough how a Regexp encoding (that Regexp#encoding method returns) is calculated in case an encoding modifier (e.g. u, e, etc) is specified to a Regexp with interpolation (e.g. /a #{ "b".encode("windows-1251") } c/e).
  • The encoding modifier might be applied in some cases and not in others, which seems confusing at first glance.
  • The result seems to depend on the source encoding, the characters in the Regexp literal fragments, and the encoding of the interpolated fragments.

Discussion:

Conclusion:

[Bug #20421] String#index and String#byteindex don't clear $~ when offset > size (or bytesize) (andrykonchin)

  • It seems String#{index,rindex,byteindex,byterindex} methods when called with Regexp argument and offset out of scope should clear the $~ global variable.
  • But String#index and String#byteindex don't clear $~ when offset > size, only when offset < -size.
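
A minimal reproduction sketch for the two offset cases. The string and patterns are arbitrary; what $~ holds in the first case depends on whether the Ruby being run has the fix, so the script prints it rather than assuming a value.

```ruby
str = "abc"

str =~ /b/                                # populate $~ with a successful match
found_past_end = str.index(/z/, 10)       # offset > size
match_after_past_end = $~

str =~ /b/                                # populate $~ again
found_before_start = str.index(/z/, -10)  # offset < -size
match_after_before_start = $~

puts "offset > size:  result=#{found_past_end.inspect}, $~=#{match_after_past_end.inspect}"
puts "offset < -size: result=#{found_before_start.inspect}, $~=#{match_after_before_start.inspect}"
```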

Conclusion:

[Bug #20416] IO#read doesn't preserve buffer encoding if maxlen = nil (andrykonchin)

  • IO#read and similar methods when called with buffer argument preserve its encoding.
  • But IO#read doesn't do so in case the maxlen argument is nil.
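
The two cases can be compared side by side with a pipe. This is a sketch only: read_into_buffer is a made-up helper, and the printed encodings depend on the Ruby version, so none are asserted here.

```ruby
# Hypothetical helper: reads "abc" from a pipe into a US-ASCII buffer.
def read_into_buffer(maxlen)
  reader, writer = IO.pipe
  writer.write("abc")
  writer.close
  buffer = String.new(encoding: Encoding::US_ASCII)
  reader.read(maxlen, buffer)
  buffer
ensure
  reader&.close
end

with_length = read_into_buffer(3)
without_length = read_into_buffer(nil)

puts "maxlen = 3:   #{with_length.encoding}"
puts "maxlen = nil: #{without_length.encoding}"
```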

Discussion:

  • We should ask matz if intended (TODO)

Conclusion:

[Bug #20319] Singleton class is being frozen lazily in some cases (andrykonchin)

  • When an object becomes frozen (with #freeze method call) only its own singleton class becomes frozen immediately.
  • Classes in the singleton classes chain become frozen only when they are returned to user with #singleton_class method call.
  • The object's singleton class' singleton class (and so on) doesn't become frozen even if it was already instantiated
  • This lazy freezing can be observed when a user gets the singleton class of the object's singleton class before freezing the object: after freezing the object, the already-instantiated singleton class is still not frozen.
  • There are several options: (1) don't change anything, (2) freeze all the instantiated singleton classes in a chain immediately, or (3) don't freeze singleton classes in a chain at all, including the object's own singleton class. There is a PR with a fix (https://github.com/ruby/ruby/pull/10245), but it is unclear which option is best.
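
The observation described above can be reproduced with a few lines. The variable names are illustrative, and the second and third results depend on which of the options a given Ruby version implements, so they are printed rather than assumed.

```ruby
obj = Object.new
meta = obj.singleton_class
meta_meta = meta.singleton_class   # instantiated before freezing the object

obj.freeze

puts "singleton class frozen?                   #{meta.frozen?}"
puts "its singleton class (pre-existing)?       #{meta_meta.frozen?}"
puts "its singleton class (fetched again)?      #{meta.singleton_class.frozen?}"
```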

Discussion:

  • Jeremy: We have a working solution that walks up the singleton class chain and walks back down the attached object chain and freezes the singleton classes up the chain as far as possible. Passes tests but not sure if the semantics are desired. Ask @nobu to review the PR.
  • Chris Salzberg: Is there a concrete example of this causing problems?
    • Jeremy: The singleton class will be frozen but the singleton class of the singleton class will not be, which is inconsistent.
  • Benoit: Should we freeze meta-metaclass on object.freeze?

Conclusion:

  • Further discussion needed. Matz wants to think and understand the issue better.

[Feature #20425] Optimize forwarding callers and callees (tenderlovemaking)

  • Introduces optimization to avoid allocations regarding ... callers and callees
  • Stack size for a method like def foo(...) depends on the caller (foo(1,2) has different stack size than foo(1))
  • Ko1 is worried about complex code and incompatibilities because of stack size and offered a different solution
  • Aaron thinks Ko1's solution is of similar complexity but moves complexity to GC
  • Can we try the current solution and refactor / revert if there are incompatibilities?
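
For reference, this is the kind of code the optimization targets: a method that forwards everything it receives with .... The method names target and forward are illustrative.

```ruby
# `target` and `forward` are illustrative names.
def target(a, b = nil, key: nil, &block)
  [a, b, key, block&.call]
end

# Positional arguments, keywords and the block are all passed through,
# so the number of values on the stack depends on each call site.
def forward(...)
  target(...)
end

p forward(1)
p forward(1, 2, key: :k) { :from_block }
```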

Discussion:

  • Aaron: I have slides for this!
  • Jeremy: I reviewed the code, I think it is a good patch, and that is the change I would have liked to implement.
  • Slides are here

Conclusion:

[Feature #20443] Allow Major GC's to be disabled (eightbitraptor)

  • Introduces the ability to "turn off" Major GC's, so that only minors will run. This is useful for applications that are using Out-of-band GC.
  • Discussion has focussed around the method names, can we make a decision about what the interface should be? Options proposed are:
    • New methods, eg. GC#disable_major, GC#enable_major, GC#need_major?, GC#disable_major_gc etc.
    • Keyword Args, eg. GC#disable(type: major)
  • @eightbitraptor and @byroot prefer new methods that will respond_to?, and are able to be undefined.
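
The respond_to? argument can be sketched as follows. GC.disable_major and GC.enable_major are only the names proposed above, not an existing API, so the calls are guarded; GC.start(full_mark: false) is the existing way to request a minor GC.

```ruby
# `GC.disable_major` / `GC.enable_major` are proposed names (assumption);
# a method-based API lets callers detect whether the running GC offers them.
major_control = GC.respond_to?(:disable_major) && GC.respond_to?(:enable_major)

GC.disable_major if major_control
begin
  # Request handling would go here; only minor GCs are wanted meanwhile.
  GC.start(full_mark: false)
ensure
  # Out-of-band: re-enable major GCs once the latency-sensitive work is done.
  GC.enable_major if major_control
end

puts "major GC control available: #{major_control}"
```

With a keyword-argument API such as GC.disable(type: ...), the same feature check would not be possible through respond_to? alone.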

Discussion:

  • Matt: Most of the feedback has been about what the API should look like. If you have opinions, they are welcome. I prefer the method-based API since it allows undefining the method or checking for its existence.
  • Matz: My concern is what might happen with alternative GC implementations.
    • Matt: That's why the proposal is for a new method, which might be unimplemented when alternative GC implementations are present.
    • Peter: For pluggable GC, I moved most GC methods into the GC implementation, so alternative implementations can define them differently. Some methods already do not make sense for alternative GC implementations.

Conclusion:

  • Matz: As long as my pluggable GC concern is addressed and cleanly described, I have no objection.

[Bug #20455] rb_errinfo() inconsistent with $! in the caller Ruby code (eregon)

  • Could we make them consistent? Or completely separate?
  • If not, what is the current semantics (in English) and is it something we want to keep?

Discussion:

  • Documentation is misleading, they are almost separate (except when no rescue/ensure on stack):
/**
 * This is the same as `$!` in Ruby.
 *
 * @retval   RUBY_Qnil  Not handling exceptions at the moment.
 * @retval   otherwise  The current exception in the current thread.
 * @ingroup  exception
 */
VALUE rb_errinfo(void);
  • Does anyone use rb_errinfo(), what for?
  • Let's discuss this in detail on a blackboard with Koichi another time

Conclusion:

[Feature #20437] Could the licensing conditions be made less ambiguous?

[Bug #20438][Bug #20439] String format "%\n" and "%\0" does not raise format error

"%\n" has been treated as "%%" since commit:554b989ba162 , probably Tue Aug 6 01:12:32 1996 according to the commit log.

[Feature #20460] Ripper eval option

  • Can we discuss the proposals to make branch maintainers' lives easier, so that we can target 6-7 teeny releases per stable version per year?

Discussion:

  • Ufuk: People may see Ruby as a healthier project if they get more releases

    • 3.3.0 was a good example where people were eagerly waiting for the 3.3.1 release
  • Benoit: How about making x.y.1 release faster with most important fixes, even if that means not backporting every fix?

  • Having too many releases may degrade user experience if they need to upgrade more often or if they need to go through many releases to get a bugfix

  • A more established process may make it easier for feature implementers to submit fixes for teeny versions

  • Some users are concerned by the lack of communication on the status of upcoming releases. A more established process would make it easier for users to know when to expect bug fixes

  • Jean: recent releases had bugs that prevented people from upgrading, even though they were eager to update to benefit from the performance improvements

    • As an example, people were very excited to use 3.3 for the performance but couldn't because of the Fiber on linux-aarch64 bug or the YJIT memory leaks

Conclusion:

  • Could branch maintainers explain what would make it easier for them to cut releases?

Extra Agenda Items

kddnewton

  • [Misc #20238] Use prism for mk_builtin_loader.rb

    • This was discussed at a previous dev meeting, but not everyone was present, so I would like to discuss it again.
    • Benoit: The mk_builtin_loader improvement is a more elegant and complete approach. Can we merge this?
    • naruse: some concerns about maintainability, conversation was stalled
      • using Prism means a newer version of Ruby is required
      • another approach would be to write his own version of mk_builtin_loader
    • Benoit: the change is 48 lines of code. It dumps the AST to JSON so it can be used from Ruby.

    Conclusion:

    • Will discuss with @kokubun when he is here.
  • [Bug #20401] Duplicated when clause warning line number

    • Is it okay to change the warning message? The current phrasing is confusing.

    Conclusion:

    • Nobu is ok with changing the error message.

tenderlovemaking

  • [Bug #20424] ZLib::GZipReader always double allocates strings when passed outbuf, significantly increasing memory usage

    • Lots of discussion on the GitHub PR here
    • What do we need to do to get it merged?

    Discussion:

    • Aaron: Can Nobu review the PR again? The change looks fine.

    Conclusion:

    • Nobu will review again.
    • Done.

eregon

  • [Feature #20331] Should parser warn hash duplication and when clause?

    • Should we keep this useful warning and maybe optionally skip it when memory is constrained, e.g. for picoruby?

    Discussion:

    • Matz: I have no strong opinion on not depending on BigInteger.
    • One idea is to only warn for e.g. 32-bit integer duplicated keys, that should catch most cases and yet keep it simple for MRuby/PicoRuby/etc