# Gluster Team offsite in BLR - 16th May, 2017
### What will improve Community Contributions?
- Pranith had suggested using Github questions
- Github has extensive community contributor presence and the "Issues" allow us to have a structured method to measure where we stand at any particular point
- RG: should we look at creating an incentive plan for when users contribute to things which are relevant to Gluster (karma points, badges?) Atin has concerns about this approach; issues should be assessed on severity/importance
- Would moving off RHBZ help? There are specific topics which prevent us from moving off RHBZ (security bugs)
- Amar: today we are chasing people away from Github. A hybrid approach would instead let potential contributors send a PR on Github and then take it to Gerrit if required. What would it take to make Gluster a more effective project?
Github PR and Gerrit automatic sync.
**ACTION** Amar will review the languishing merge requests and respond to Nigel's note to arrive at closure on this topic
Atin: Why don't all the maintainers subscribe to Gerrit, receive notifications for all incoming patches and use filters?
Aravinda: That is part of the workflow I follow
Amar: **Everyone please do this!**
RG: There are quite a few patches where reviews are not followed up or builds have failed - who is responsible for these?
Amar: The maintainer would look at the patch only when it is concerned. Unless it is a critical path,
Soumya: Except for code contribution; external individuals do not get attention (even with emails to the list).
Nithya: One of the things preventing people from contributing is that they cannot reproduce the issue. They do not understand the code or, how they arrived there. Gluster developers debug and analyse the issue for them to provide a prescriptive approach.
Sunil: The entry path is convoluted. The "search" for Gluster takes you to the Github page and there's no defined high-level path which can provide the user with adequate information to get started. The ramping up of new contributors needs to be structured.
Kaushal: Most developers come through Github. The initial README needs to be improved (at least snippets).
**ACTION** Sunil to revisit the README and send a patch to Amye
Jiffin: Technical detail in the source code - is there a reason for being that way?
Amar: Because it gets reviewed and approved and having it with the source is ideally good
Nithya: **Code needs comments**.
Shyam (notes copied over from agenda page)
1) Understanding openness around features. IOW, get your thoughts out as github issues
2) Understanding openness around roadmap. IOW, declare what you are working on and what you intend to work on, which results in nice looking github release scope and focus area for the community.
3) Understanding openness around progress of focus areas (which I think Amar's virtual teams would cover).
4) Completeness of any deliverable, feature-spec -> code -> documentation -> release-notes, all using one issue for better trace-ability. We could also talk about blogs and highlights for features as required here. (IOW, also understanding how to leverage issues across commits and repositories)
5) Focusing on the critical things (esp. during a release), like the review backlog, feature backlog, regression failures, testing needs, to help a release ship on time.
### Good Build Conversation
Amar: There have been some discussions on the quality of the product and we in development have to take an action item on collaboration. Has everyone read the email from Nigel called *Good Build*? Show of hands - not many people. [Good Build email](http://lists.gluster.org/pipermail/gluster-devel/2017-March/052245.html)
We run regressions per patch to see if things are fine. We do not *believe* that the coverage is good enough. We can make it similar to downstream and add more test cases there. We can consider making it run nightly and making it public, so that it takes master of each day and thus on any given day we know what the status is. So, when we decide on a date to choose/rebase from upstream, we know which master passes these tests (and hopefully use cases). Concerns need to be discussed **now**, and then Shwetha will lead on what additional pieces need collaboration.
Nigel: Where are we right now? Shwetha provided good coverage of a basic set of downstream tests. For every test we can verify that with too much effort (list to be linked by Nigel later)... The problem is that when some of the tests fail, there is more work being added upstream and everyone has to look at the failures to assess whether the component or the test has a bug. Does everyone know how to write Glusto tests?
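The nightly-run idea above can be sketched as follows. This is only an illustration, assuming a hypothetical record of nightly results (run date, commit on master, pass/fail); the names `NightlyResult` and `last_good_master` and the sample data are made up and do not come from the actual infrastructure.

```python
from datetime import date
from typing import NamedTuple, Optional


class NightlyResult(NamedTuple):
    run_date: date   # day the nightly ran against master
    commit: str      # hypothetical commit id of master for that day
    passed: bool     # True if the whole nightly suite passed


def last_good_master(results, rebase_date) -> Optional[NightlyResult]:
    """Return the most recent passing nightly on or before rebase_date."""
    candidates = [r for r in results if r.passed and r.run_date <= rebase_date]
    return max(candidates, key=lambda r: r.run_date, default=None)


# Made-up sample data, for illustration only.
nightlies = [
    NightlyResult(date(2017, 5, 14), "aaa111", True),
    NightlyResult(date(2017, 5, 15), "bbb222", False),
    NightlyResult(date(2017, 5, 16), "ccc333", True),
]
```

With a public record like this, picking a rebase point becomes a lookup rather than a fresh investigation: `last_good_master(nightlies, date(2017, 5, 15))` falls back to the 14 May run because the 15 May nightly failed.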
.t already works (Ravi); Pranith: it is a time problem.
Amar: For tests around brick multi-plexing; we should be able to configure making things mandatory and the combination of tests can be had.
Atin: When Jeff wrote up this patch, the first version had variables. Currently, it is an option which needs to be set explicitly. The other approach (as Jeff does) is to mock the function; in discussions with Nigel it was thought feasible to set up a nightly run, apply the patch and validate brick mux functionality
Nigel: This needs to be part of a development cycle. Unless we test patches from Shwetha - this is not going to be able to add value in our flow.
Pranith: The main problem is developers having to write more tests
Nigel: The key issue is tests are failing and no one is looking at them.
RG: Do we need a ***Test Maintainer*** Role?
SHP: QE and Devels have to collaborate to understand how to have the best tests to include in the automation.
**ACTION** Need a page to list failures, Nigel has the current responsibility about this.
Amar: GCC provides enough tools and methods to allow code coverage tests. These things should be also considered as part of the nightly
Pranith: Testing things based on line coverage is risky because it provides a false sense of security. Even if the code goes through that branch, it may not test the cases it should be testing. It is important for individuals who know the component to be involved with the testing.
Amar: Aspects like race conditions cannot be part of line coverage
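A small illustration of the point about line coverage, assuming a hypothetical helper `mean_latency` (not taken from the Gluster code base): one test executes every line, so a coverage report shows 100%, yet an obvious edge case is never exercised.

```python
def mean_latency(samples):
    """Hypothetical helper: average of a list of latency samples."""
    total = sum(samples)
    return total / len(samples)


def test_mean_latency():
    # Executes every line of mean_latency, so line coverage reports 100%.
    assert mean_latency([2, 4]) == 3


def empty_input_crashes():
    """The untested case: an empty list raises ZeroDivisionError."""
    try:
        mean_latency([])
    except ZeroDivisionError:
        return True
    return False
```

Only someone who knows what inputs the component actually receives would think to add the empty-list case, and timing-dependent problems such as races do not show up in a line coverage number at all.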
Nithya: Do we have timelines?
Amar: By end of May 2017, we need this running
Nigel: We will have the infrastructure running
Nithya: How easy is it to get access to infra?
Nigel: We'll have DO instances to be handed over to devels for analysis
Nithya: How many tests are there?
SHP: We have coverage on the BVT. (more detail from Shwetha...)
**ACTION** Nigel to set up line coverage test on nightly.
**ACTION** Nigel will create 2 branches for Glusto tests for identifying stable tests
**ACTION** Kaushal to announce to the Gluster developers about the availability of the systems/infra which enable running Glusto tests "locally". This is linked with the lab revamp activity that Kaushal is planning.
**PROPOSAL** Talur: Glusto tests **should** block a release
**ACTION** Amar to write up the proposal covering the external things (viz. Coverity issues etc) and send it to the list
Templates to automate and standardize on testing -- Amar's recommendation
### NetBSD - where are we going with this?
Amar: There's a lack of awareness of the number of users of NetBSD and Gluster. There's been a reasonable tail of NetBSD catching various errors (32 bit compat and such). NetBSD has different POSIX compliance and it is important. But should it be the blocker? Given that it is not the major focus area of the project ... where do we stand?
NetBSD should be run nightly but it should not be a blocker. Should it vote -1? It is not a good thing. So, should it be blocker for a release?
Nigel: Debugging NetBSD issues is frustrating
Talur: If the project is going to state that "NetBSD is somewhat supported"; otherwise, we can focus on specific long tail of things which have been waiting for a long time and a few of these have potential to vastly improve the project. Patches that work in Linux, may not work in other *nix
* https://review.gluster.org/#/c/14613/
* https://review.gluster.org/#/c/15358/
* https://review.gluster.org/#/c/14670/
* https://review.gluster.org/#/c/14671/
Poornima: If there are not enough users we should look at the topic of NetBSD "support" in the project
**PROPOSAL** The latest build in NetBSD is 3.8; to be able to move ahead we provide the way to contribute
**ACTION** Mail to be sent by <insert name/Amar?> to the -devel list proposing a way forward on the NetBSD support/focus
### Deadlines for ACTION items (summary)
Amar: By 10 June the notes on Council; Maintainers; Good Build; Infrastructure; NetBSD will be sent out to the lists
### Gluster 4.x+ Roadmap; Planning and Beyond
Shyam's note on 3.12 and 4.0 scope (gluster-devel) [[Gluster-devel] Release 3.12 and 4.0: Thoughts on scope](http://lists.gluster.org/pipermail/gluster-devel/2017-May/052811.html) - this is based on a discussion and notes off a Google document (release wishlist)
- GeoReplication to cloud --> policy driven and "frozen files"
- not intended to be for VM store use case; sync the snapshot of VM; S3 are object stores and thus inherently not for VM images; mostly for archiving (Aravinda)
- restore is a question; the internals of the operation need to be thought out (Sahina)
- Throttling support on server side to manage running self heal processes
- The server can go into a "hang"-like situation based on the current approach; (Ravi) it is client based and not server based
- Brick mux --> better support and more control
- **ACTION** Samikshan to create a RHBZ/issue to track all brick mux work
- GFID to path improvements
- Kotresh has a bug
- Resolve issues around disconnects and ping-timeouts
- RG: There's a degree of FUD around the topic - there is a RHBZ from a RHGS customer which indicates that there are specific aspects to work on. Ping timeouts have however been increasing (under load). It would appear that analysis of loads might be a better approach towards understanding the specifics of the topic. The improvements can be ...
- **ACTION** Raghavendra G: the work needs to be chunked out into specific topics (release based) which would help work on the various issues (viz. ping; generation etc https://docs.google.com/a/redhat.com/spreadsheets/d/1ii66it6aZzl6JwEoRIDz5x40EF4D2cJa3jXNvjvi9W8/edit?usp=sharing) *please link the ongoing conversations around this topic*
- HALO with hybrid mode
- In 3.11 there was work undertaken; FB has a requirement for a *hybrid mode* (please add definition and link the Github issue). Pranith
- +1 scaling of cluster
- this is a deep topic which spans across a number of topics. Shyam+Amar
- Look-up optimize turned on by default
- RG: needs testing
- SusantP: remove brick does not work on all nodes(?)
- Phase 1 Server side clustering
- for the thin client, the use case would be about resource consumption (samba and container workload); changes are needed in AHA and libgfapi. FB don't handle ....; Pranith & Poornima
- **ACTION** Github issue is to be created and email to be sent out making the wider community aware. Pranith to complete by 15 Jun
### Github Gluster Organization
General health check on who is not part of it and mass addition by Nigel
### Some Ideas came up earlier
- GFID Path #139
- GFID Type (File type in GFID) #207
- Geo-rep s3 sync #179
- Changelog record rename (old) path #202
- Quota #181, #182, #183, #184, #78
- Brick Split #169
- Monitoring #168, #141, #137
- Events
- glusterd2 (and other quicker/better options as alternatives)
- Python CLI client bindings (has some traction)
- What is the path for other bindings?
- DHT2
- Throttling / Noisy Neighbour
- Thin Clients (Server side clustering) (Nick name: gfproxy)
- Trusted client? (Talur to talk about it)
- Snapshot : What next? Who next? #145
- Tiering : What next?
- **ACTION** Amar to sync with Shyam
- Non-root users focus
- SELinux
- **ACTION** Jiffin to consolidate observations (on Github issue?)
- EC
- Support Different work load
- **ACTION** Ashish to create github issue #211
- Any other algorithms
- **ACTION** Ashish to create github issue
* Subdirectory mount #175
- Help to check netbsd failure
- Pending auth-allow/reject work
- ctime generator #208
- Arch and 32bit/64bit compatibility
- dict does not check for arch differences
- many patches blocked because they use dict
- **ACTION** Amar and Poornima to work on this
- Brick-multiplex - What more?
- Testing:
- gcov / lcov in testing #206
- Memory usage tests / improvements #52, #57
- Coverity scan issues
- Address sanitizer to Vote?
- clang check to vote?
* https://github.com/gluster/glusterfs-patch-acceptance-tests/pull/17
* https://review.gluster.org/#/c/11083/
* DECISION : Not to vote, developers free to run tool in repo
- Debug focus:
- SystemTap
- Keep NNN call_stack in memory #173
- Always keep DEBUG/TRACE log in memory #172
- Client List - #178, #161
- Thread Id in log.
- Request id created in Master xlators and passed down, logged.
- Locking enhancements #205
- Mutex lock contention reduction
- Halo - What next? #199
- Rebalance Improvements #197, #155
- Lookup optimized (on by default)
- Transaction library
- (posix-lock transactions)
* Krutika working with Xavi
- Gluster's Internal locks
- libgfapi (also brick multiplex?), global resource pool
- Performance xlators: Dedup cache, integrate with upcall/leases, Use case based profiles, Auto tuning certain parameters.
- Glusterd volume options re-look
- Op-version issues
* Make op-versions global - Kaushal
* fix op-version match to be a function rather than a simple numeric check
* volume sets should be specifying multiple op-versions (30813, 31003, etc)
- Better memory management
- Our own heap management to replace io-buf
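The op-version item above (a match function instead of a simple numeric check, with multiple op-versions per option) could look roughly like the sketch below. It rests on several assumptions that were not stated in the discussion: op-versions are encoded as major * 10000 + minor * 100 + patch (so 30813 is 3.8.13 and 31003 is 3.10.3), each listed value is the minimum for its own release stream, and any stream newer than the newest listed one already contains the change. The function name `op_version_satisfied` is hypothetical and is not an existing glusterd function.

```python
def op_version_satisfied(cluster_op_version, required_op_versions):
    """Check a cluster op-version against per-stream minimum op-versions.

    Assumption: op-version = major * 10000 + minor * 100 + patch, so the
    release stream (e.g. 3.8 or 3.10) is op_version // 100.
    """
    if not required_op_versions:
        raise ValueError("at least one required op-version is needed")

    # Lowest required op-version for each release stream.
    minimum_by_stream = {}
    for version in required_op_versions:
        stream = version // 100
        minimum_by_stream[stream] = min(
            version, minimum_by_stream.get(stream, version)
        )

    cluster_stream = cluster_op_version // 100
    if cluster_stream in minimum_by_stream:
        # Same stream: the cluster must be at or past the backport point.
        return cluster_op_version >= minimum_by_stream[cluster_stream]

    # Unlisted stream: assume only streams newer than every listed one
    # already carry the change.
    return cluster_stream > max(minimum_by_stream)
```

Under these assumptions, a plain `cluster_op_version >= 30813` check would accept a cluster at 31000 even though the change only landed in that stream at 31003, whereas `op_version_satisfied(31000, [30813, 31003])` rejects it.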
### Public Facing Challenges
* [Gap Analysis Document](https://docs.google.com/a/redhat.com/document/d/1IL6SE9dnhX8MJha7RctCOTiXYd1XlhvaY6OyDCJwX34/edit?usp=sharing)
* Feedback on website design/mock-up?
* WordPress powered site
* integrated into the blog as well
* easy ACLs and permissions
* Developer link into Github landing page (Kaushal)
* ~ last half of June for go-live with the website
* (Raghavendra) at DevConf.in booth users were talking about workloads rather than anything else
* Amye - we could do like dcos.io
* lack of a Slack community is hindering outreach to DevOps
* note: sankarshan didn't realize that there's a gitter presence
* (Aravinda) A listing of "who uses Gluster" is much needed; like 'Friends of Rust' etc. Often the lack of cited references lowers the confidence in the project
* Documentation will be under docs.gluster.org
* ASCIIdoc (same as Fedora, CentOS etc)
* The coverage of the documentation is not great
* A consultant is engaged to be able to provide Gluster with inputs which lead to substantial improvements (QSG etc in the Gap Analysis Document)
* Conferences and Meetups
* Should focus on meetups
* (Jiffin) there used to be Gluster HackDays alongside meetups, tied to each release (3.x, 4.0) - focused on testing, debugging etc