- Add `byrow(allequal)` as a special case of `byrow(isequal)`. (This feature needs at least Julia 1.8)
- Fix a bug in stat routines for some corner cases
- Fix unnecessary allocations in stat routines
- Fix a performance issue in `sort` due to the recent change in `Threads.@threads`.
- Fix the allocation problem in computing `var` and `std` in the fast path of `gatherby`.
- Fix an issue with Julia-latest.
- Now we exploit multithreading when gathering observations for huge data sets.
- Fix a problem that was causing tests to fail in Julia 1.9
- Fix an issue with `eltype` and the output of `eachcol`. Now `eltype(::Type{<:DatasetColumns})` properly returns `AbstractDatasetColumn` instead of `AbstractVector`.
- Fix a problem with `nonmissingtype` with `Union{}` output.
- Fix an issue that was causing the join functions to sort already-sorted data sets, issue #108
- Remove precompilation for Julia 1.9, since it causes an enormous amount of allocation during precompilation and loading
- Now `IMD` throws an error when accessing a grouped data set whose parent has been modified.
- Functions `searchsorted`, `searchsortedfirst`, and `searchsortedlast` now work with `DatasetColumn`
- Fix a bug in `byrow(nunique)`
- Fix a bug which caused `stable=true` to be ignored in `gatherby`, issue #100
- Add docstrings for `groupby!`, `groupby`, and `gatherby`.
- Fix an issue with `QuickSortAlg` in future versions of Julia
- Empty the rows of a `SubDataset` without columns
- Fix a bug which caused `modify/combine` to throw errors on columns with `Vector{Vector}` type
- Users can use `resize!` to resize a data set
- Fix function signature for some stat functions
- Update to `PrettyTables` version 2
- Fix a bug in `byrow` for writing values of type `BigInt`
- Update for Julia `VERSION >= v"1.9.0-DEV.1635"`
- Fix a bug in `modify` which caused an error to be thrown while showing an error
- Fix a bug in `sort` which caused a `Bool` to be treated as a vector of length 1
- `topk` and `topkperm` use `isless` by default for comparing values.
- Fix a bug in `show` which caused the format of a column to be ignored when calculating the max width.
- Better `show` for `GroupBy`/`GatherBy` in Jupyter
- `hcat!` keeps the format of the second data set.
- Fix an issue in `show` with HTML MIME, issue #91
- Now `Jupyter` shows very wide data sets much faster, issue #82
- Add precompilation for Julia > 1.8
- The `topk` and `topkperm` functions support two extra arguments: `lt` and `by`, which by default are set to `<` and `identity`, respectively
- `topkperm` is a new function for outputting the indices of the top (bottom) k values, issue #67.
- `topk` now supports any `DataType`, see issue #67.
- `filter`, `filter!`, `delete` and `delete!` have a new keyword argument for controlling how the missing values should be interpreted, issue #69
- `topk` now works on `DatasetColumn`/`SubDatasetColumn`.
- Stats functions throw `ArgumentError` when an empty vector is passed to them.
- The `topk` and `topkperm` functions are multithreading ready, i.e. users can pass `threads = true` to these functions.
  - Now we use binary search for large values of k. This improves the performance of the functions in the worst-case scenarios.
- `row_join!` allocates less when `mapformats=true` and thus performs better. This directly affects `filewriter` performance in `DLMReader`.
- New functionality has been added to `byrow` for passing a Tuple of column indices. `byrow(ds, fun, cols)` calls `fun.(ds[:, cols[1]], ds[:, cols[2]], ...)` when `cols` is an NTuple of column indices.
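
The broadcast form described in the `byrow` entry can be illustrated without the package. The following is a minimal sketch in plain Julia, assuming ordinary vectors stand in for the columns of a data set; the helper name `rowwise` and the sample values are hypothetical and are not part of the package API.

```julia
# Plain vectors stand in for two columns of a data set (made-up values).
col1 = [1, 2, 3]
col2 = [10, 20, 30]
cols = (col1, col2)

# Hypothetical helper: broadcast `fun` over the columns, one call per row,
# mirroring the form fun.(ds[:, cols[1]], ds[:, cols[2]], ...).
rowwise(fun, cols::Tuple) = fun.(cols...)

total  = rowwise(+, cols)    # element-wise sums, one value per row
larger = rowwise(max, cols)  # row-wise maximum of the two columns
```

Each result has one element per row, because the function is applied across the columns rather than down them.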
- Fix type ambiguity in `filter/!`
- Two new functions: `delete` and `delete!`. They should be compared to `filter` and `filter!`, respectively - issue #63
- Add `DLMReader` to `sysimage` in `IMD.create_sysimage`.
- Fix mistakes in `byrow(argmin)` and `byrow(argmax)` - pull #62
- `byrow(ds, t::DataType, col)` converts values of `col` to `t`.
- Fix an issue in `flatten/!` - columns with type `Any`.
- Fix an issue with `IMD.create_sysimage` - issue #59
- Improve `eachgroup`
- Drop support of `UInt16` in `Characters` - `Characters` now only supports length
- Users can now choose to include the observation ids for the left data set and/or the right data set in the output data set.
- Add a new function `eachgroup`. It allows iteration over each group of a grouped data set.
- `op` is a new keyword argument for the `update/!` functions which allows passing a user-defined function to control how the values of the main data set should be updated by the values from the transaction data set. (issue #55)
- Support for the `mapformats` keyword argument in `flatten/!`. Now users can flatten a data set based on the formatted values. (issue #57)
- Support for the `threads` keyword argument in `flatten/!`.
- The `combine` function now works correctly when a view of a data set is passed
- For the join functions, the `makeunique` argument is now passed correctly to the internal functions.
- `update` and `update!` have the same `mode` option by default.
- Fix the problem with preserving the format of `SubDataset` in `flatten/!`
- Fix the problem that caused `flatten!` to produce a copy of the data when an empty data set was passed to it.
- Fix the bug in `flatten!` related to flattening the first column.
- Fix the bug in `flatten` that caused a segmentation fault for views of data sets.
- Faster `flatten/!`
- The `outerjoin` function accepts the `source` keyword argument.
- All join functions support the `obs_id` option. This allows outputting the observation ids of the matched pairs.
  - All join functions support `obs_id_name` for assigning column names for `obs_id`.
- The `leftjoin/!`, `innerjoin` and `outerjoin` functions support the `multiple_match` option. This indicates the rows in the left data set that have been repeated in the output data set due to multiple matches in the right data set.
  - All join functions support `multiple_match_name` for assigning the column name for `multiple_match`.
- The `compare` function is updated to support more complex comparisons (issue #53).
  - [BREAKING] the `on` keyword argument in previous versions is equivalent to the `cols` keyword argument in version 0.7.0+.
  - The `compare` function can compare two data sets with different numbers of rows.
  - Users can pass key columns to `compare`, via the `on` keyword argument, for matching observations before comparing.
  - A few keyword arguments are added to `compare` for supporting new functionalities.
- The `maximum` and `minimum` functions now work properly with `String` columns.