The rsubgroup package: Algorithms for subgroup discovery and analytics for the R environment!
The rsubgroup package provides a collection of efficient and effective algorithms and tools for subgroup analytics. It features subgroup discovery, community mining, and subsequent analysis for the free R environment.
The package provides an interface to the org.vikamine.kernel library of the VIKAMINE system (http://www.vikamine.org), which implements subgroup discovery and pattern mining algorithms in Java, for example SD-Map*, SD-Map, and BSD.
Download and install from CRAN:
The latest stable release of the rsubgroup package is available on CRAN.
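As a minimal sketch, the CRAN installation can be done from within an R session. It assumes internet access, a configured CRAN mirror, and a working Java installation on the machine (needed because the package talks to Java code).

```r
# Install the released version from CRAN only if it is not already present.
# system.file() returns "" for packages that are not installed.
if (!nzchar(system.file(package = "rsubgroup"))) {
  install.packages("rsubgroup")
}

# Loading the package confirms that R can find it.
library(rsubgroup)
```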
Development:
Installation of the current development package:
- Download the package.
- Open a shell and go to the download directory.
- Run "R CMD INSTALL <rsubgroup-package>".
- On Windows, you might need to run "SET JAVA_HOME=" (which clears the variable) before installing the package.
- Also on Windows, if the installation fails because jvm.dll cannot be located, "R CMD INSTALL <rsubgroup-package> --no-multiarch" might be necessary.
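The same steps can also be sketched from inside R instead of a shell. The helper below is hypothetical (install_rsubgroup_dev and its arguments are not part of the package); it only wraps the standard install.packages() and Sys.unsetenv() functions and assumes the development package has already been downloaded as a source archive.

```r
# Hypothetical helper: install a downloaded development package from R.
install_rsubgroup_dev <- function(pkg_file,
                                  no_multiarch = FALSE,
                                  clear_java_home = FALSE) {
  # Fail early if the downloaded archive cannot be found.
  stopifnot(file.exists(pkg_file))

  if (clear_java_home) {
    # Counterpart of 'SET JAVA_HOME=' on Windows: remove the variable
    # so that the installation process does not inherit it.
    Sys.unsetenv("JAVA_HOME")
  }

  # repos = NULL tells install.packages() to install from a local file.
  args <- list(pkgs = pkg_file, repos = NULL, type = "source")
  if (no_multiarch) {
    # Passed through to R CMD INSTALL, as in the Windows hint above.
    args$INSTALL_opts <- "--no-multiarch"
  }
  do.call(install.packages, args)
}

# Example call; the path is a placeholder for the downloaded file:
# install_rsubgroup_dev("path/to/<rsubgroup-package>", no_multiarch = TRUE)
```

The two Windows workarounds are off by default, so the plain call behaves like "R CMD INSTALL <rsubgroup-package>".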
Usage hints:
- Check the subgroup discovery examples contained in the documentation.
- .jinit(parameters="-Xmx2048m") lets you give the JVM a larger maximum heap (here, 2 GB).
Important: This must happen before rJava is used in any way.
Once the JVM has been initialized and started, changing the heap size no longer has any effect.
It is therefore recommended to call .jinit(...) right after loading the rJava package.
Example:
library(rJava)
.jinit(parameters="-Xmx2048M") # for two gigabytes heap space, for example
library(rsubgroup)
- In CreateSDTask(source, ...), source can be a data frame or a file name. Passing a file name hands the data directly to the subgroup discovery algorithms on the Java side, which is more memory-efficient than converting a data frame to the Java representation.
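The hints above can be combined into one short session, shown here as a sketch rather than a reference. It assumes the credit.data example data set, the as.target(), DiscoverSubgroups(), ToDataFrame() functions and the SDTaskConfig class as used in the package's documentation examples; please check the exact signatures with ?DiscoverSubgroups. The attribute names and the file path in the last comment are illustrative.

```r
library(rJava)
# Request the heap size before the JVM starts; later requests are ignored.
.jinit(parameters = "-Xmx2048m")
library(rsubgroup)

# Ask the running JVM for its maximum heap size to confirm the setting.
runtime <- .jcall("java/lang/Runtime", "Ljava/lang/Runtime;", "getRuntime")
max_heap_mb <- .jcall(runtime, "J", "maxMemory") / 1024^2
print(max_heap_mb)

# Discover subgroups for the target class = good in the example data,
# restricting the search to a few attributes (assumed column names).
data(credit.data)
config <- new("SDTaskConfig",
              attributes = c("checking_status", "credit_amount",
                             "employment", "purpose"))
result <- DiscoverSubgroups(credit.data, as.target("class", "good"), config)

# Convert the resulting patterns into a data frame for inspection.
subgroups <- ToDataFrame(result)
print(subgroups)

# Memory-friendly variant from the hint above: pass a file name instead of
# a data frame. The path is a placeholder, not a file shipped with the package.
# task <- CreateSDTask("path/to/data-file", as.target("class", "good"))
```

The reported heap size can be slightly below the requested value, so it is better to compare it against a rough threshold than to expect exactly 2048 MB.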
References and related publications:
[1] Martin Atzmueller (2015) Subgroup Discovery. WIREs Data Mining Knowl Discov, 5:35-49.
[2] Martin Atzmueller and Florian Lemmerich (2012) VIKAMINE - Open-Source Subgroup Discovery, Pattern Mining, and Analytics.
In: Proc. ECML/PKDD, Springer Verlag.
[3] Martin Atzmueller (2007) Knowledge-Intensive Subgroup Mining - Techniques for Automatic and Interactive Discovery.
Dissertations in Artificial Intelligence-Infix (Diski), (307), IOS Press.
[4] Martin Atzmueller and Frank Puppe (2006) SD-Map - A Fast Algorithm for Exhaustive Subgroup Discovery.
In: Knowledge Discovery in Databases: PKDD 2006, LNCS 4213, pp. 6-17, Springer Verlag.
[5] Martin Atzmueller, Frank Puppe and Hans-Peter Buscher (2005) Exploiting Background Knowledge for Knowledge-Intensive Subgroup Discovery.
In: Proc. 19th International Joint Conference on Artificial Intelligence (IJCAI-05), pp. 647-652, Edinburgh, Scotland.
Last updated: 2021-02-05 by Martin Atzmueller.