Default index
TL;DR
The proposal is to have an option `mode.no_default_index`, in which:

Why?
The Index can be a source of confusion and frustration for pandas users. For example:
- it can be unexpected that summing `Series` with the same length (but different indices) produces `NaN`s in the result (https://stackoverflow.com/q/66094702/4451315);
- concatenation, even with `ignore_index=True`, still aligns on the index (https://github.com/pandas-dev/pandas/issues/25349);
- it can be frustrating to have to repeatedly call `.reset_index()` (https://twitter.com/chowthedog/status/1559946277315641345).

With this mode enabled, two major changes would happen:
- `as_index` option in `groupby`, and allowing for `value_counts` to not set an index.

With this option enabled, users who don't want to worry about indices could safely ignore them.
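The first two surprises described in this section can be reproduced in current pandas with a short sketch. The sample values and the names `ser1`, `ser2`, `total` and `wide` are illustrative assumptions, and `axis=1` is just one way to hit the concatenation behaviour.

```python
import pandas as pd

# Illustrative inputs (assumed for this sketch): same length, different indices.
ser1 = pd.Series([10, 15, 20, 25], index=[1, 2, 3, 5])
ser2 = pd.Series([10, 15, 20, 25], index=[1, 2, 3, 4])

# Addition aligns on index labels, so a label present on only one side
# produces NaN in the result.
total = ser1 + ser2

# With axis=1, ignore_index=True only resets the column labels;
# the rows are still aligned on the index.
wide = pd.concat([ser1, ser2], axis=1, ignore_index=True)

print(total)
print(wide)
```

Both results have five rows rather than four, because the labels `4` and `5` each exist in only one of the inputs.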
How?
NoIndex DataFrame
A DataFrame without an index would have an index which would behave like a RangeIndex, except for the following differences:
- `name` could only be `None`;
- `start` could only be `0`, `step` `1`;
- `Index` should still be `NoIndex`;
- `NoIndex`;
- `NoIndex` (so `transpose` would need some adjustments);
- `insert` and `delete` should raise. In particular, `.drop` with `axis=0` would always raise;

Don't give people an index unless they ask for one
Some pandas methods create an Index by default. This can sometimes be opted out of (e.g. with `as_index=False` in `.groupby`), but other times there is no choice but to call `reset_index` after the operation (e.g. with `.pivot_table` and `.value_counts`).

A couple of solutions come to mind:

- `as_index` options to these methods, whose default could be `False` under this option;

The second would keep API size down, whilst the first one would give the most flexibility to users. I'd be more inclined towards the former.
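To make the current situation concrete, here is a minimal sketch of how these methods behave in pandas today. The frame `df` and its column names are assumptions made up for illustration; nothing here uses the proposed option.

```python
import pandas as pd

# Illustrative data (assumed for this sketch).
df = pd.DataFrame({"key": ["a", "b", "a"], "value": [1, 2, 3]})

# groupby puts the grouping key in the index by default...
indexed = df.groupby("key")["value"].sum()

# ...but this can be opted out of with as_index=False.
flat = df.groupby("key", as_index=False)["value"].sum()

# value_counts and pivot_table always set an index, so getting plain
# columns back takes a reset_index() call afterwards.
# (The column names produced for `counts` differ between pandas versions.)
counts = df["key"].value_counts().reset_index()
pivoted = df.pivot_table(index="key", values="value", aggfunc="sum").reset_index()

print(flat)
print(counts)
print(pivoted)
```

Only the `groupby` call can avoid the index up front; the other two need the extra `reset_index()` step.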
How to ask for an index?
It should be fine to do `df.reset_index().set_index('index')`, no need to add a new method.

Downstream libraries
seaborn
Seaborn makes extensive use of label-based indexing, and so NoIndex DataFrames would break it:
Even if `df` had an Index, `seaborn.lineplot` would still error because internally it creates new DataFrames (which now wouldn't have an index) and then it would call things that wouldn't work on them, such as `data.loc[[]]`.

This would need some working out.
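For reference, this is what a `data.loc[[]]` call does on a DataFrame with a default index in current pandas. The frame below is an illustrative assumption, not seaborn's internal data, and how a NoIndex DataFrame should treat the same call is exactly the open question.

```python
import pandas as pd

# Illustrative frame (assumed for this sketch).
data = pd.DataFrame({"x": [0, 1, 2], "y": [3.0, 4.0, 5.0]})

# Label-based selection with an empty list of labels is valid today:
# it returns an empty frame that keeps the original columns.
empty = data.loc[[]]

print(empty.shape)
print(list(empty.columns))
```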
Why not have `.index` be None, rather than a NoIndex?
`.index` methods are quite common to call (e.g. https://github.com/pandas-dev/pandas/blob/dbb2adc1f353d9b0835901c274cbe0d2f5a5664f/pandas/core/series.py#L877).
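A small sketch of why a `None` index would be awkward for code that calls `.index` methods. The helper `same_labels` and the stand-in object `indexless` are hypothetical and exist only for illustration; they are not pandas internals.

```python
from types import SimpleNamespace

import pandas as pd


def same_labels(left, right) -> bool:
    """Hypothetical helper that assumes `.index` is always an Index object."""
    return left.index.equals(right.index)


ser = pd.Series([1, 2, 3])
matches = same_labels(ser, ser.copy())  # fine: both have a real Index

# Hypothetical stand-in for an object whose `.index` is None.
indexless = SimpleNamespace(index=None)

message = None
try:
    same_labels(indexless, ser)
except AttributeError as err:
    # Every such call site would need its own None check.
    message = str(err)

print(matches)
print(message)
```

With a NoIndex object in place of `None`, calls like this could keep working without each call site being guarded.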
Roadmap - how to make this change?
In pandas 2.x.0, introduce the `mode.no_default_index` option. It's unlikely that this could ever be made the default, but it could be made the default in a separate namespace (which would try to be compliant with the DataFrame standards API).

Resources
pandas issue: https://github.com/pandas-dev/pandas/issues/48880