The fBasics Package

February 22, 2006

Version 221.10065
Date 1996 - 2006
Title Rmetrics - Markets and Basic Statistics
Author Diethelm Wuertz and many others, see the SOURCE file
Depends R (>= 1.9.0), methods
Maintainer Diethelm Wuertz <wuertz@itp.phys.ethz.ch>
Description Environment for teaching "Financial Engineering and Computational Finance"
License GPL Version 2 or later
URL http://www.rmetrics.org

R topics documented:

MarketStatistics
TimeSeriesImport
PlotFunctions
fBasicsData
ZivotWangData
StableDistribution
HyperbolicDistribution
SmoothedSplineDistribution
DistributionFits
StylizedFacts
BasicStatistics
PortableRandomInnovations
HypothesisTesting
OneSampleTests
TwoSampleTests
fBasicsTools
Index

MarketStatistics        Import Market Data from the Internet

Description

A collection and description of functions to extract financial and economic market statistics from the data available in the CIA World Factbook and from the exchange data collected by the World Federation of Stock Exchanges.

The functions are:

ciaCountries      Returns a list of CIA country codes,
ciaIndicators     Returns a list of CIA indicator codes,
ciaByCountry      Returns all indicators by country,
ciaByIndicator    Returns the indicator ranking for all countries.

To load statistics from the WFE:

data(wfe1)    Market capitalization of domestic companies,
data(wfe2)    Total number of companies with shares listed,
data(wfe3)    Total value of share trading,
data(wfe4)    Market value of bonds listed,
data(wfe5)    Total value of bond trading, and
data(wfe6)    Price earning ratio and gross dividend yield.

Usage

ciaCountries()
ciaIndicators()
ciaByCountry(code = "CH", from = FALSE, names = FALSE, details = TRUE)
ciaByIndicator(code = 2001, from = FALSE, details = TRUE)

## S3 method for class 'ciaCountries':
print(x, ...)
## S3 method for class 'ciaIndicators':
print(x, ...)

Arguments

code
    [ciaByCountry] a character string denoting the country code.
    [ciaByIndicator] a character string or integer denoting the indicator code.
details
    a logical flag. Should details be printed? By default TRUE.
from
    a logical flag. If set to TRUE, an additional column will be returned with the information when the data were recorded.
names
    a logical flag. If set to TRUE, the full names of the countries will be returned in an additional column.
x
    an object of class ciaCountries or ciaIndicators as returned by the functions ciaCountries or ciaIndicators, respectively.
...
    arguments to be passed to the print method.

Details

Financial and economic market statistics can be found at several web pages for free.
The "OECD Factbook" from the 'Organisation for Economic Co-operation and Development', www.oecd.org, "The World Factbook" from the 'Central Intelligence Agency' of the US, www.cia.gov, and the "Penn World Tables" from the 'Center for International Comparisons' at the University of Pennsylvania, pwt.econ.upenn.edu, offer sources of economic, environmental and social indicators for the world's core economies. Statistical data from the exchanges around the world can be obtained from the 'World Federation of Stock Exchanges', www.fibv.com. Further sources of statistical data can be found on the web pages of the 'Bank for International Settlements', www.bis.org, and on the web pages of the 'International Monetary Fund', www.imf.org.

Value

ciaCountries
    returns a data frame with countries and country codes.
ciaIndicators
    returns a data frame with indicator codes.
ciaByCountry
    returns a data frame with indicators by country.
ciaByIndicator
    returns a data frame with ranked data for a given indicator.

Author(s)

Diethelm Wuertz for the Rmetrics R-port.

References

CIA, 2004, CIA Factbook 2004, http://www.cia.gov/cia/publications/factbook.
WFE, 2004, World Federation of Stock Exchanges, Focus 2004, http://www.world-exchanges.org.

Examples

## SOURCE("fBasics.11A-MarketStatistics")

## Pie Chart from CIA Oil Production Indicator (Code 2173):
   # Search for Code:
   ciaIndicators()
   # Create Pie Chart:
   OilProduction = as.integer(as.vector(ciaByIndicator(2173)[2:11, 2]))
   names(OilProduction) = as.vector(ciaByIndicator(2173)[2:11, 1])
   OilProduction
   pie(OilProduction, col = rainbow(10))
   title(main = "Oil Production 2004\n bbl/day")
   mtext("Source: CIA World Factbook", side = 1)

## Barplot from WFE Capitalization Statistics:
   # Extract Capitalization of/at:
   # NYSE: 7, Tokyo: 37, London: 22, Frankfurt: 15
   # 1991 - 2003, every third year: 3,6,9,12,15
   data(wfe1)
   Table = t(wfe1[c(7,37,22,15), c(3,6,9,12,15)])/1e6
   colnames(Table) = c("NewYork", "Tokyo", "London", "Frankfurt")
   rownames(Table) = as.character(seq(1991, 2003, by = 3))
   Table
   # Create Barplot:
   barplot(Table, beside = TRUE, legend = rownames(Table),
     col = c("lightblue", "mistyrose", "lightcyan", "lavender", "cornsilk"))
   title(main = "Stock Market Capitalization\n 1991 - 2003")
   mtext("Source: World Federation of Exchanges", side = 4,
     line = -2, cex = 0.7)


TimeSeriesImport        Import Market Data from the Internet

Description

A collection and description of functions to import financial and economic market data from the Internet. Download functions are available for economic and financial market data from Economagic's, from Yahoo's, and from the Federal Reserve's Internet sites.

The functions are:

economagicImport    Economic series from Economagic's Web site,
yahooImport         daily stock market data from Yahoo's Web site,
keystatsImport      key statistics from Yahoo's Web site,
fredImport          time series from St. Louis FRED Web site,
forecastsImport     monthly data from the Financial Forecast Center.

Usage

economagicImport(query, file = "tempfile",
    source = "http://www.economagic.com/em-cgi/data.exe/",
    frequency = c("quarterly", "monthly", "daily"),
    save = FALSE, colname = "VALUE", try = TRUE)
yahooImport(query, file = "tempfile",
    source = "http://chart.yahoo.com/table.csv?",
    save = FALSE, sep = ";", swap = 20, try = TRUE)
keystatsImport(query, file = "tempfile",
    source = "http://finance.yahoo.com/q/ks?s=",
    save = FALSE, try = TRUE)
fredImport(query, file = "tempfile",
    source = "http://research.stlouisfed.org/fred2/series/",
    frequency = "daily", save = FALSE, sep = ";", try = TRUE)
forecastsImport(query, file = "tempfile",
    source = "http://www.forecasts.org/data/data/",
    save = FALSE, try = TRUE)

show.fWEBDATA(object)

## S3 method for class 'keystats':
print(x, ...)

Arguments

colname
    [economagicImport] a character string which defines the name of the value column. By default "VALUE".
file
    a character string with the filename, usually having extension ".csv", where to save the downloaded data.
frequency
    a character string, one of "quarterly", "monthly", or "daily", defining the frequency of the data records.
object
    an S4 object of class "fWEBDATA".
query
    a character string, denoting the location of the data at the web site.
save
    a logical value, if set to TRUE the downloaded data file will be stored under the path and file name specified by the string file. By default FALSE.
sep
    a character value, defining the field separator for the destination file which saves the downloaded data records. By default a semicolon.
source
    a character string with the download URL.
swap
    [yahooImport] an integer value which sets the two-digit year at which dates switch between the centuries 19xx and 20xx, by default 20, i.e. the switch is made at 1920. This is necessary since Yahoo does not list the century in its dates, e.g. "15-Aug-02".
try
    a logical value, if set to TRUE the Internet access will be checked.
x
    [print.keystats] an object of class keystats as returned by the function keystatsImport.
...
    optional arguments passed to the print.keystats function.

Details

Import data from www.economagic.com:
Frequently requested data files from Economagic for the US economy include:

[query]                     Description:
var/leading-ind-long        Index of Leading Economic Indicators
beana/t102l01               Real Gross Domestic Product
fedstl/trsp500              SP 500 Total Return
fedstl/gnp                  Gross National Product in Current Dollars
var/cpiu-long               Consumer Price Index - All Urban Consumers
feddal/ru                   Unemployment Rate
fedstl/indpro               Total Industrial Production Index
fedstl/exjpus+2             FX Rate: Japanese Yen to one US Dollar
fedstl/fedfunds+2           Federal Funds Rate
fedstl/mdiscrt+2            Discount Rate
fedbog/tcm30y+2             30-Year Treasury Constant Maturity Rate
fedstl/mprime+2             Bank Prime Loan Rate
fedstl/tb3ms+2              3-Month Treasury Bills - Secondary Market
fedstl/tb6ms+2              6-Month Treasury Bills - Secondary Market
fedbog/cm+2                 30 Year Federal Home Loan Mortgages
var/west-texas-crude-long   Price of West Texas Intermediate Crude

Import data from chart.yahoo.com:
The query string is given as

    s=SYMBOL&a=MM&b=DD&c=CCYY&g=d&q=q&z=SYMBOL&x=.csv

where SYMBOL has to be replaced by the symbol name of the instrument, and MM, DD, and CCYY by the month-1, the day, and the century/year when the time series should start. Here are some examples of symbols:

[query]   Description:
^DJI      Dow Jones 30 Industrial Averages
^NYA      New York Stock Exchange Composite
^NDX      Nasdaq 100 Index
^IXIC     Nasdaq Composite Index
^TYX      US 30Y Treasury Bond Index
IBM       IBM DJIA Stock
KO        Coca-Cola DJIA Stock

The meaning of the tokens in the query string is the following:

Token   Description
s       Selected Ticker-Symbol
a       First Quote starts with Month (mm)
b       First Quote starts with Day (dd)
c       First Quote starts with Year (ccyy)
d       Last Quote ends with Month (mm)
e       Last Quote ends with Day (dd)
f       Last Quote ends with Year (ccyy)
z       Selected Ticker-Symbol

Note that month tokens range between 0 and 11 for January to December!
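
As an illustration of how the tokens listed above combine, the following sketch assembles a query string from a ticker symbol and two dates. The helper name buildYahooQuery is hypothetical and not part of fBasics. It follows the token table (a, b, c for the start date and d, e, f for the end date) together with the zero-based month convention, and it leaves out the g and q tokens of the template. Whether the service still accepts this layout is not something this sketch can confirm.

```r
# Hypothetical helper, not part of fBasics: assemble a chart.yahoo.com
# style query string from a ticker symbol and a start and end date.
buildYahooQuery <- function(symbol, from, to) {
  from <- as.POSIXlt(as.Date(from))
  to   <- as.POSIXlt(as.Date(to))
  paste0(
    "s=", symbol,
    "&a=", from$mon,          # start month, 0 = January ... 11 = December
    "&b=", from$mday,         # start day of the month
    "&c=", from$year + 1900,  # start year including the century
    "&d=", to$mon,            # end month, zero-based as well
    "&e=", to$mday,           # end day of the month
    "&f=", to$year + 1900,    # end year including the century
    "&z=", symbol,
    "&x=.csv")
}

# December 1, 1999 to January 31, 2000 for IBM:
query <- buildYahooQuery("IBM", "1999-12-01", "2000-01-31")
print(query)
```

Relying on the zero-based mon field of POSIXlt avoids subtracting one from the month by hand, which is the step the note above warns about.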

Key statistics data from finance.yahoo.com:
The function downloads the key statistics for the specified equity query and returns the result as a two column data frame. The key names included are: "Market Cap", "Enterprise Value", "Trailing P/E", "Forward P/E", "PEG Ratio", "Price/Sales", "Price/Book", "Enterprise Value/Revenue", "Enterprise Value/EBITDA", "Annual Dividend", "Dividend Yield", "Beta", "52-Week Change", "52-Week High", and "52-Week Low".

Value

The functions economagicImport, fredImport, and yahooImport return an S4 object of class fWEBDATA with the following slots:

@call
    the function call.
@data
    the data as downloaded, formatted as a data.frame.
@param
    a character vector whose elements contain the values of selected parameters of the argument list.
@title
    a character string with the name of the download. This can be overwritten by specifying a user defined input argument.
@description
    a character string with an optional user defined description. By default just the current date will be returned.

The function keystatsImport returns an S3 object of class keystats with two entries: $query holds the query name, and $keystats holds the data frame with the statistical values.

Note

Internet Download:
Note that if the service provider changes the data file format, it may become necessary to modify and update the functions.
The R package tseries from Adrian Trapletti offers an alternative function to download stock market data and indexes from Yahoo's Internet site.

Author(s)

Diethelm Wuertz for the Rmetrics R-port.
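
The slot layout of the fWEBDATA objects described under Value can be pictured with a small S4 sketch. The class name webDataSketch, the constructor newWebData, and the chosen slot types are assumptions made for this illustration only; the actual class definition in fBasics may differ.

```r
library(methods)

# Illustrative stand-in for the fWEBDATA class; the class name and the
# slot types are assumptions of this example, not the fBasics definition.
setClass("webDataSketch",
  representation(
    call        = "call",
    data        = "data.frame",
    param       = "character",
    title       = "character",
    description = "character"))

# Hypothetical constructor mimicking what an import function might return.
newWebData <- function(data, query, title = "Web Data Import") {
  new("webDataSketch",
    call        = match.call(),
    data        = data,
    param       = c(query = query),
    title       = title,
    description = as.character(Sys.Date()))   # default: the current date
}

obj <- newWebData(data.frame(DATE = "2000-01-31", VALUE = 1.5),
  query = "DEMO")
print(obj@data)     # slots are accessed with the @ operator
print(obj@param)
```

This mirrors the idiom used in the examples of this topic, where the downloaded records are read from the data slot, e.g. USDEUR@data.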

Examples

## Not run:
## SOURCE("fBasics.12A-TimeSeriesData")

## economagicImport
   xmpBasics("\nStart: Daily Foreign Exchange Rates > ")
   USDEUR = economagicImport(query = "fedny/day-fxus2eu",
     frequency = "daily", colname = "USDEUR")
   # Print Data Slot if Internet Download was Successful:
   if (!is.null(USDEUR)) print(USDEUR@data[1:20, ])

## economagicImport
   xmpBasics("\nNext: USFEDFUNDS Monthly US FedFunds Rates > ")
   USFEDFUNDS = economagicImport(query = "fedstl/fedfunds+2",
     frequency = "monthly", colname = "USFEDFUNDS")
   if (!is.null(USFEDFUNDS)) print(USFEDFUNDS@data[1:20, ])

## economagicImport
   xmpBasics("\nNext: USGNP Quarterly GNP Data Records > ")
   USGNP = economagicImport(query = "fedstl/gnp",
     frequency = "quarterly", colname = "USGNP")
   if (!is.null(USGNP)) print(USGNP@data[1:20, ])

## yahooImport
   xmpBasics("\nNext: IBM Shares from Yahoo > ")
   # [test 19/20 century change 01-12-1999 -- 31-01-2000]
   query = "s=IBM&a=11&b=1&c=1999&d=0&q=31&f=2000&z=IBM&x=.csv"
   IBM = yahooImport(query)
   if (!is.null(IBM)) print(IBM@data[1:20, ])

## keystatsImport
   xmpBasics("\nNext: Key Statistics IBM Shares from Yahoo > ")
   keystatsImport("IBM")

## fredImport
   xmpBasics("\nNext: DPRIME Daily Bank Prime Loan Rate Records > ")
   DPRIME = fredImport("DPRIME")
   if (!is.null(DPRIME)) print(DPRIME@data[1:20, ])

## End(Not run)


PlotFunctions        Basic Plot Functions and Utilities

Description

A collection and description of several plot functions and utilities which may be useful for the exploratory data analysis of financial and economic market data using S4 time series objects from Rmetrics. Also included are utility functions displaying tables for characters, plot symbols, and colors.

The functions are:

tsPlot                Time Series Plot,
histPlot              Histogram Plot of (Series) Data,
densityPlot           Kernel Density Estimate Plot,
circlesPlot           Scatterplot of Circles Indexing a 3rd Variable,
perspPlot             Perspective Plot in 2 Dimensions,
contourPlot           Contour Plot in 2 Dimensions,
characterTable        Table of Numerical Equivalents to Latin Characters,
plotcharacterTable    Table of plot characters, plot symbols,
colorTable            Table of Color Codes and Plot Colors itself,
greyPal               Creates a grey palette like rainbow does for colors,
splusLikePlot         Scales plotting and symbols.

Usage

tsPlot(x, ...)
histPlot(x, col = "steelblue4", border = "white",
    main = x@units, add.fit = TRUE, ...)
densityPlot(x, col = "steelblue4", main = x@units, add.fit = TRUE, ...)

circlesPlot(x, y, size = 1, ...)
perspPlot(x, y, z, theta = -40, phi = 30, col = "steelblue4", ps = 9, ...)
contourPlot(x, y, z, ...)

characterTable(font = 1, cex = 0.7)
plotcharacterTable(font = par('font'), cex = 0.7)
colorTable(cex = 0.7)
greyPal(n = 64, start = 255-n, end = 255)

Arguments

add.fit
    [histPlot][densityPlot] a logical flag, if set to TRUE, which is the default value, then a normal fit will be added to the plot, otherwise not.
col, border
    two character strings, defining the colors used to fill the bars and to plot the borders of the bars for the histPlot, or color and linetype as used by the function plot.
font, cex
    an integer value, the number of the font, by default font number 1, the standard font for the characterTable or the current plot character font for the plotcharacterTable. The character size is determined by the numeric value cex, the default size is 0.7.
main
    character string of main title(s).
n, start, end
    [greyPal] n gives the number of greys to be constructed, start and end span the range of the color palette. By default 64 grey tones, chosen equidistantly from the color range (191, 191, 191) to (255, 255, 255).
size
    a numeric vector like z, the third variable in the function circlesPlot. The argument gives the size of the circles, by default all values are set to 1.
theta, phi, ps
    plot parameters for the function perspPlot as used by the function persp.
x, y
    [tsPlot][histPlot][densityPlot] an object of class timeSeries.
    [circlesPlot][perspPlot] numeric vectors. In the case of the thermometerPlot x holds the current values of the n-bars.
z
    a matrix containing the values to be plotted by the function perspPlot.
...
    arguments to be passed to the underlying plot function.

Details

Series Plots:
tsPlot plots time series on a common plot. Unlike plot.ts the series can have different time bases, but they should have the same frequency. tsPlot is a synonym function call for R's ts.plot from the ts package.

Histogram and Density Plots:
histPlot and densityPlot show (return) distributions in the form of a histogram and a density plot. The first is a synonym function call for R's histogram plot, calling the function hist. The look of the plot is more S-Plus like. The second creates a density plot with underlying kernel density estimation. It is a function call to R's density function.

Three Dimensional Plots:
circlesPlot and perspPlot, see also contour, are functions to plot 3 dimensional data sets. The first is a simple scatterplot with points replaced by circles of variable size, whose size indexes a third variable. The second allows one to create a perspective 3 dimensional plot. It is a function call to R's persp plot, but the parameters are set to produce a more S-Plus like look of the plot. contourPlot is the call to R's contour plot from the base package.

Plot Utilities:
characterTable, plotcharacterTable and colorTable are three helpful utilities for using characters and colors in plots. The first function displays a table of numerical equivalents to Latin characters, the second displays a table of plot characters, i.e. the symbols used in plots, and the third displays a table with the codes of the default plot colors.
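
Since histPlot and densityPlot are described as calls to R's hist and density, the idea behind them, including the optional normal fit, can be sketched with base graphics alone. This is an illustration under assumptions and not the fBasics source: the data are simulated, and the normal fit is taken to be the normal density with the sample mean and standard deviation.

```r
# Base-R sketch of a histogram with a fitted normal density and of a
# kernel density estimate; an illustration, not the fBasics code.
set.seed(4711)
x <- rnorm(500, mean = 0.001, sd = 0.02)   # simulated "returns"

pdf(NULL)                 # null device; drop this line to plot on screen
par(mfrow = c(1, 2))

# Histogram on the density scale, so that a density curve can be overlaid:
h <- hist(x, breaks = 25, probability = TRUE,
  col = "steelblue4", border = "white", main = "Histogram")
grid_x <- seq(min(x), max(x), length.out = 200)
lines(grid_x, dnorm(grid_x, mean = mean(x), sd = sd(x)))   # normal fit

# Kernel density estimate with the same normal fit for comparison:
d <- density(x)
plot(d, col = "steelblue4", main = "Density")
lines(grid_x, dnorm(grid_x, mean = mean(x), sd = sd(x)), lty = 2)

invisible(dev.off())
class(h)   # "histogram"
class(d)   # "density"
```

The classes of the two returned objects match what the Value section states for histPlot and densityPlot.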

Value

tsPlot
    plots a time series. Just a synonym call to the function ts.plot, changing the plot type to lines and the plot color to steelblue3.
histPlot
    plots a histogram. This is a synonym function call for R's histogram plot, calling the function hist. Returns an object of class "histogram", see hist.
densityPlot
    returns an object of class "density", see density.
circlesPlot
    a simple pseudo three dimensional scatterplot of circles whose sizes index a third variable.
perspPlot, contourPlot
    draw perspective or contour plots of surfaces over the x-y plane.
characterTable
    displays a table with the characters of the requested font. The character on line "xy" and column "z" of the table has code "\xyz", e.g. cat("\126") prints: V for font number 1. These codes can be used like any other characters.
plotcharacterTable
    displays a table with the plot characters numbered from 0 to 255.
colorTable
    displays a table with the plot colors and the associated color numbers.

Author(s)

Gordon Smyth for the circlesPlot, Pierre Joyet for the characterTable, and Diethelm Wuertz for the Rmetrics R-port.
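
The three digit codes shown by characterTable are octal escapes. A short base-R sketch, independent of fBasics, shows how the code from the example above maps to a character and to its decimal value:

```r
# "\126" is an octal escape: 1*64 + 2*8 + 6 = 86, the code of the letter V.
code <- strtoi("126", base = 8L)   # octal string to integer
print(code)

ch <- "\126"                       # the same character written as an escape
cat(ch, "\n")
print(utf8ToInt(ch))               # back to the decimal character code

# The copyright sign used in the examples has octal code 251:
print(strtoi("251", base = 8L))    # 2*64 + 5*8 + 1 = 169
```

Reading the table row as the first two digits and the column as the third digit gives the escape to type in a string.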

Examples

## SOURCE("fBasics.12B-PlotFunctions")

## Not run:
## tsPlot
   xmpBasics("\nStart: European Stock Markets > ")
   # Show multiple plot:
   par(mfrow = c(1, 1), cex = 0.7)
   data(DowJones30)
   DowJones.ts = as.timeSeries(DowJones30)[, c("CAT", "GE", "IBM", "JPM")]
   tsPlot(DowJones.ts)
   title(main = "CAT - GE - IBM - JPM")

## histPlot
   xmpBasics("\nNext: Histogram Plot of Normal Random Numbers > ")
   DowJones.ret = returnSeries(DowJones.ts)
   par(mfrow = c(2, 2), cex = 0.7)
   histPlot(x = DowJones.ret)

## densityPlot
   xmpBasics("\nNext: Density Plot of Normal Random Numbers > ")
   densityPlot(x = DowJones.ret)

## circlesPlot
   xmpBasics("\nNext: 3D Circles Plot of Normal Random Numbers > ")
   par(mfrow = c(1, 1), cex = 0.7)
   circlesPlot(x = rnorm(50), y = rnorm(50), size = abs(rnorm(50)),
     main = "Circles Plot")

## perspPlot
   xmpBasics("\nNext: Perspective Plot > ")
   par(mfrow = c(1, 1))
   x = y = seq(-10, 10, length = 51)
   f = function(x, y) { r = sqrt(x^2+y^2); 10 * sin(r)/r }
   z = outer(x, y, f)
   perspPlot(x, y, z)
   title(main = "Perspective Plot", line = -3)

## characterTable
   xmpBasics("\nNext: Print the Copyright Sign > ")
   cat("\251 \n")

## characterTable
   xmpBasics("\nNext: Display Character Table for Symbol Font > ")
   characterTable(5)

## colorTable
   xmpBasics("\nNext: Display Table of Plot Colors > ")
   colorTable()

## plotcharacterTable
   xmpBasics("\nNext: Display Table of Plot Characters > ")
   plotcharacterTable()
## End(Not run)


fBasicsData        fBasics Data Sets

Description

A collection and description of data sets used in the examples. Included are data files for high frequency FX data, time and sales data for financial futures, and market returns for selected stock and market indexes. Also included are tables for the finite sample Jarque-Bera test.

The data sets are:

audusd.csv       Reuters Tick-by-Tick AUDUSD rates 1997-10,
usdthb.csv       Reuters Tick-by-Tick USDTHB rates 1997,
usddem30u.csv    Reuters 30 min USDDEM rate in upsilon time,
usdchf.csv       Reuters 30 min USDCHF Rates 199604-200103,
fdax9710.csv     Minute-by-Minute DAX Futures Prices for 1997-10,
fdax97m.csv      Minutely Time and Sales DAX Futures for 1997*,
bmwres.csv       Daily log Returns of German BMW Stock Prices,
nyse.csv         Daily Values of the NYSE Composite Index,
nyseres.csv      Daily log Returns of the NYSE Composite Index,
jbLM             Table for the Jarque Bera Lagrange Multiplier test,
jbALM            Table for the JB Augmented LM finite sample test,
PhiStable        Table of Contours for Stable Parameter Estimation.

The CIA Factbook:

ciaFactbook.R    Macroeconomic data from the CIA Factbook.

The Tables from the World Federation of Exchanges, WFE:

wfe1.csv    Table 1, market capitalization of domestic companies,
wfe2.csv    Table 2, total number of companies with shares listed,
wfe3.csv    Table 3, total value of share trading,
wfe4.csv    Table 4, market value of bonds listed,
wfe5.csv    Table 5, total value of bond trading,
wfe6.csv    Table 6, price earning ratio and gross dividend yield.

*The file fdax97m.csv is too large and therefore not part of the fBasics distribution. Please contact inf@rmetrics.org.

Details

High Frequency Data for the AUDUSD and USDTHB:
audusd and usdthb archive high frequency exchange rates for the Australian / US Dollar exchange rate in October 1997 and exchange rates for the US Dollar / Thai Baht exchange rate in June 1997: A comma delimited CSV file with 6 columns.
The first column, named XDATE, contains date/time entries in ISO-8601 format as [CCYYMMDDhhmm], the second column, named DELAY, gives the delay in minutes between the time stamp of the Reuters data record and the arrival time at the local database, the third column, named CONTRIBUTOR, is the Reuters identification, a 4 character code, the fourth and fifth columns, named BID and ASK, are the bid and ask price quotations, and finally the sixth column, named FLAG, is not used and has zeros as entries.

30 Minutes Data for the USDDEM Rate in Upsilon time:
usddem30u archives 30 min USDDEM bid and ask prices for the US Dollar / German Mark exchange rate ranging from 1992-10-01 00:09 until 1997-05-30 21:49. A comma delimited CSV file with 3 columns. The first column, named "%Y%m%d%H%M", contains date/time entries in ISO-8601 format as [YYYYMMDDhhmm], the second column, named BID, gives the bid prices, and the third column, named ASK, gives the ask prices retrieved from the Reuters data records.

30 Minutes Data for the USDCHF Rate:
usdchf archives 30 min USDCHF midprices for the US Dollar / Swiss Franc exchange rate ranging from 1996-04-01 until 2001-03-30. A comma delimited CSV file with 2 columns. The first column, named "%Y%m%d%H%M", contains date/time entries in ISO-8601 format as [YYYYMMDDhhmm], the second column, named USDCHF, gives the prices retrieved from the Reuters data records.

DAX Futures Data:
fdax9710 archives returns of minute-by-minute prices for Dax Futures in October 1997: A comma delimited CSV file with 2 columns. The first column, named XDATE, contains date/time entries in ISO-8601 format as [CCYYMMDDhhmm], the second column, named FDAX, gives an averaged price of the Dax Futures, i.e. the mean of all volume weighted time and sales within the same minute.
fdax97m archives returns for minute-by-minute prices for Dax Futures in 1997: A comma delimited CSV file with 2 columns.
The first column, named XDATE, contains date/time entries in ISO-8601 format as [CCYYMMDDhhmm], the second column, named FDAX, gives a minutely averaged price during opening hours of the exchange, i.e. the mean of all volume weighted time and sales within the same minute.

Log Returns for BMW Shares and NYSE Composite Index:
bmwres and nyseres archive log returns of the German BMW stock listed in the German DAX30 and log returns of the NYSE Composite Index, both on a daily trading day time scale, just numbering the log returns: A one column CSV file with column names BMW or NYSERES, respectively. The entries are the differences of the logarithmic prices on two succeeding trading days. nyse contains two column data records, the date and the NYSE Composite Index.

Jarque Bera Normality Test:
jbLM and jbALM are finite sample tables for the Jarque Bera Lagrange multiplier and augmented Lagrange multiplier normality test. The columns denote the sample sizes and the rows the probabilities.

Stable Parameter Estimation:
.PhiStable is a list object containing two tables for the estimation of the parameters of a stable distribution using McCulloch's approach.

Source

audusd, usdthb
    The data were collected by D. Wuertz and R. Schnidrig from the Reuters data feed.
fdax9710, fdax97m
    The data were extracted from time and sales data records from the Frankfurt Futures Exchange.
bmwres
    The data were published in the EVIS software package.
nyse, nyseres
    The data were downloaded from the web site of the New York Stock Exchange, http://www.nyse.com, and the residuals were calculated as logarithmic price differences.
jbLM, jbALM
    Monte Carlo Simulations by D. Wuertz and H.G. Katzgraber.
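
Two conventions from the Details above recur across these data sets: compact date/time stamps of the form CCYYMMDDhhmm, and log returns defined as differences of logarithmic prices. The following base-R sketch applies both to a few invented records in the two-column layout described for fdax9710. The prices are made up for illustration, and the GMT time zone is an assumption, since the time zone of the files is not stated.

```r
# Hypothetical records in the two-column layout described for fdax9710:
# XDATE as CCYYMMDDhhmm and an averaged price (values invented here).
records <- data.frame(
  XDATE = c("199710010900", "199710010901", "199710010902", "199710010903"),
  FDAX  = c(4200.0, 4202.5, 4199.0, 4201.0),
  stringsAsFactors = FALSE)

# Parse the compact date/time stamps; GMT is an assumption of this sketch.
time <- as.POSIXct(strptime(records$XDATE, format = "%Y%m%d%H%M", tz = "GMT"))

# Log returns: differences of the logarithmic prices of succeeding records.
logret <- diff(log(records$FDAX))
print(data.frame(time = time[-1], logret = logret))

# Summing the log returns recovers the log price change over the period,
# which is why cumulated returns trace out the (log) index.
total <- sum(logret)
print(all.equal(total, log(4201.0 / 4200.0)))
```

The same additivity is what the nyseres example relies on when it plots cumsum(x) as the cumulated index.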

Examples

## SOURCE("fBasics.12C-fBasicsData")

## plot
   xmpBasics("\nStart: Plot Residuals NYSE Composite Index > ")
   data(nyseres)
   x = as.ts(nyseres)
   par(mfrow = c(2, 1), cex = 0.75)
   plot(100*x, type = "l", col = "steelblue4",
     main = "NYSE Composite Index")
   grid()
   plot(cumsum(x), type = "l", col = "steelblue4",
     main = "Cumulated NYSE Index")
   grid()


ZivotWangData        fBasics Data Sets from MoFiTS

Description

A collection and description of public data sets used in the examples of the book 'Modeling Financial Time Series with S-Plus' written by E. Zivot and J. Wang.

The data sets are:

CPI.dat                Seasonally adjusted US Consumer Price Index,
DowJones30.csv         Dow Jones Industrial Average Index,
ford.s.csv             Daily returns of the Ford stock,
highFreq3M.df          3M High Frequency Stock Market Data,
hp.s.csv               Daily returns of the HP stock,
IP.dat                 Seasonally Adjusted US Industrial Production Index,
lexrates.dat           Spot and Forward Exchange Rate Data,
msft.dat               Open, High, Low, Close of Microsoft Stocks,
shiller.dat            Robert Shiller's Monthly Economic Data Set,
shiller.annual         Robert Shiller's Annual Economic Data Set,
singleIndex.dat.csv    Microsoft Stocks and SP500 Index Data,
varex.ts.csv           Real Stock Returns and Output Growth Data,
yhoo.df                Yahoo Stock Information.

Details

Dow Jones Industrial Average:
The file DowJones30.csv contains closing prices for the 30 stocks represented in the Dow Jones Industrial Average Index. Data are downloadable and can be updated from Yahoo's web site.

Microsoft Stocks and SP500 Index Data:
The file singleIndex.dat.csv contains the monthly closing prices for Microsoft Corporation and the SP 500 index.

Open, High, Low, Close of Microsoft Stocks:
The file msft.dat.csv contains data representing the open, high, low, close and volume information for Microsoft stocks. Data are downloadable and can be updated from Yahoo's web site.

IP and CPI Index Data:
The file IP.dat.csv contains data representing the seasonally adjusted US Industrial Production Index and the file CPI.dat.csv contains data representing the seasonally adjusted US Consumer Price Index. Data are downloadable and can be updated from Economagic's web site.

Ford and HP Stock Returns:
The files ford.s.csv and hp.s.csv contain data representing 2000 daily stock returns for the Ford and HP shares traded at NYSE. The time series span the period from February 2, 1984, to December 31, 1991. Data are downloadable and can be updated from Yahoo's web site.

3M High Frequency Data:
The file highFreq3M.df.csv holds Tsay's data for 3M. Date information is expressed as the day of the month and the number of seconds from midnight. Data are for December 1999. Columns are:
day - integer representing the trading day of the month,
sec - trade.time, integer representing the trading time recorded as the number of seconds from midnight,
price - transaction price in dollars.
Downloadable from: http://www.gsb.uchicago.edu/fac/ruey.tsay/teaching/fts

Spot and Forward Exchange Rate Data:
The file lexrates.dat.csv holds log spot exchange and forward exchange rates between
USCNS, USCNF - USD and CAD, Canadian Dollar,
USDMS, USDMF - USD and DEM, Deutsche Mark,
USFRS, USFRF - USD and FFR, French Franc,
USILS, USILF - USD and ITL, Italian Lira,
USJYS, USJYF - USD and JPY, Japanese Yen, and
USUKS, USUKF - USD and GBP, British Pound.
Source: Thomson Financial, formerly Datastream, see also: Zivot, E. (2000) Cointegration and forward and spot exchange rate regressions, Journal of International Money and Finance, 19, 387–401 and 785–812.

Robert Shiller's Monthly and Annual Economic Data:
The files shiller.dat.csv and shiller.annual.csv hold data used in the book "Irrational Exuberance" by Robert Shiller.
The data are
price - monthly nominal US SP stock market prices,
dividend - nominal SP Composite Index dividends,
earnings - nominal SP Composite Index earnings,
cpi - US Consumer Price Indexes,
real.price - real US stock market prices,
real.dividend - real SP Composite Index dividends,
real.earnings - real SP Composite Index earnings,
pe.10 - price-earnings ratios,
dp.ratio - dividend-price ratios,
dp.yield - dividend-price yield.
The last two are only listed in shiller.annual. The series start in January 1871 and end in March 2001. Data are available from: Shiller, R.J. (1989) Market Volatility, MIT Press. Shiller, R.J. (2001) Irrational Exuberance, Broadway Books.

Yahoo Stock Information:
The file yhoo.df.csv contains data representing daily transaction information of Yahoo stock, with the following six columns: Date, Open, High, Low, Close, Volume. Data are downloadable and can be updated from Yahoo's web site.

Real Stock Returns and Output Growth Data:
The file varex.ts.csv contains real stock returns and output growth data. The column MARKET.REAL lists continuously compounded real returns on the SP500 index, the column RF.REAL lists real interest rates of 30-day US Treasury Bills, the column INF lists the continuously compounded growth rate of the US CPI, and the column IPG lists the continuously compounded growth rate of US industrial production. The data are monthly, starting December 1959 and ending December 2000. Data are downloadable and can be updated from Economagic's web site.

Examples

## SOURCE("fBasics.12D-ZivotWangData")

## Not run:
## DowJones30
   xmpBasics("\nStart: Dow Jones Industrial Average > ")
   data(DowJones30)
   class(DowJones30)
   DowJones30.ts = as.timeSeries(DowJones30)
   class(DowJones30.ts)
   head(DowJones30.ts)
## End(Not run)


StableDistribution        Stable Distribution Function

Description

A collection and description of functions to compute the density, distribution function, and quantile function of the stable distribution, to generate random variates, and to compute the stable mode. Two different cases are considered, the first for the symmetric and the second for the skewed distribution.

The functions are:

[dpqr]symstb    The symmetric stable distribution,
[dpqr]stable    the skewed stable distribution,
symstbSlider    interactive symmetric distribution display,
stableSlider    interactive stable distribution display.

Usage

dsymstb(x, alpha)
psymstb(q, alpha)
qsymstb(p, alpha)
rsymstb(n, alpha)

stableMode(alpha, beta)

dstable(x, alpha, beta, gamma = 1, delta = 0, pm = c(0, 1, 2))
pstable(q, alpha, beta, gamma = 1, delta = 0, pm = c(0, 1, 2))
qstable(p, alpha, beta, gamma = 1, delta = 0, pm = c(0, 1, 2))
rstable(n, alpha, beta, gamma = 1, delta = 0, pm = c(0, 1, 2))

symstbSlider()
stableSlider()

Arguments

alpha, beta, gamma, delta
    value of the index parameter alpha with alpha in (0, 2]; skewness parameter beta, in the range [-1, 1]; scale parameter gamma; and shift parameter delta.
n
    number of observations, an integer value.
p
    a numeric vector of probabilities.
pm
    parameterization, an integer value, by default pm=0, the 'S0' parameterization.
x, q
    a numeric vector of quantiles.

Details

Symmetric Stable Distribution:
For the density and probability the approach of McCulloch is implemented. Note that McCulloch's approach has a density precision of 0.000066 and a distribution precision of 0.000022 for alpha in the range [0.84, 2.00]. Quantiles are evaluated from a root finding process via the probability function.
This leads to non-negligible errors for small quantiles, since the quality of the quantile evaluation depends on the accuracy of the probability function. To achieve higher precision, use the [dpqr]stable functions with argument beta=0. For the generation of random deviates the results of Chambers, Mallows, and Stuck are used.

Skew Stable Distribution:
The function uses the approach of J.P. Nolan for general stable distributions. Nolan derived expressions in the form of integrals based on the characteristic function for standardized stable random variables. These integrals are evaluated numerically using R's function integrate.

"S0" parameterization [pm=0]: based on the (M) representation of Zolotarev for an alpha stable distribution with skewness beta. Unlike the Zolotarev (M) parameterization, gamma and delta are straightforward scale and shift parameters. This representation is continuous in all 4 parameters, and gives an intuitive meaning to gamma and delta that is lacking in other parameterizations.

"S" or "S1" parameterization [pm=1]: the parameterization used by Samorodnitsky and Taqqu in the book Stable Non-Gaussian Random Processes. It is a slight modification of Zolotarev's (A) parameterization.

"S*" or "S2" parameterization [pm=2]: a modification of the S0 parameterization which is defined so that (i) the scale gamma agrees with the Gaussian scale (standard deviation) when alpha=2 and the Cauchy scale when alpha=1, and (ii) the mode is exactly at delta.

"S3" parameterization [pm=3]: an internal parameterization. The scale is the same as in the S2 parameterization, the shift is -beta * g(alpha), where g(alpha) is defined in Nolan [1999].

Value

All values for the *symstb and *stable functions are numeric vectors: d* returns the density, p* returns the distribution function, q* returns the quantile function, and r* generates random deviates.

The function stableMode returns a numeric value, the location of the stable mode.
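The root finding approach to quantiles described in the Details can be sketched in base R as follows. This is only an illustration under stated assumptions: quantileByRoot is a hypothetical helper, not the package's qsymstb, and the Cauchy distribution (the symmetric stable law with alpha = 1) stands in for the stable probability function because base R provides both pcauchy and qcauchy for a check.

```r
# Generic quantile by root finding: solve pfun(x) - p = 0 for x.
# 'quantileByRoot' is a hypothetical helper, not a package function.
quantileByRoot = function(p, pfun, lower = -1e4, upper = 1e4, ...) {
    sapply(p, function(pp)
        uniroot(function(x) pfun(x, ...) - pp,
            lower = lower, upper = upper, tol = 1e-10)$root)
}

# The Cauchy law is the symmetric stable law with alpha = 1, so
# base R's pcauchy/qcauchy give an independent check:
p = c(0.01, 0.25, 0.5, 0.75, 0.99)
q = quantileByRoot(p, pcauchy)
cbind(p, root = q, exact = qcauchy(p))
```

Because the root is only as good as the probability function it inverts, any error in the probability function carries over directly to the quantile, which is the effect the Details warn about.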
The functions symstbSlider and stableSlider display, for educational purposes, the densities and probabilities of the symmetric and skew stable distributions.

Author(s)

McCulloch for the 'symstb' Fortran program, and Diethelm Wuertz for the Rmetrics R-port.

References

Chambers J.M., Mallows C.L., Stuck B.W. (1976); A Method for Simulating Stable Random Variables, J. Amer. Statist. Assoc. 71, 340–344.
Nolan J.P. (1999); Stable Distributions, Preprint, University Washington DC, 30 pages.
Nolan J.P. (1999); Numerical Calculation of Stable Densities and Distribution Functions, Preprint, University Washington DC, 16 pages.
Samorodnitsky G., Taqqu M.S. (1994); Stable Non-Gaussian Random Processes, Stochastic Models with Infinite Variance, Chapman and Hall, New York, 632 pages.
Weron A., Weron R. (1999); Computer Simulation of Levy alpha-Stable Variables and Processes, Preprint, Technical University of Wroclaw, 13 pages.

Examples

## SOURCE("fBasics.13A-StableDistribution")

## rsymstb
xmpBasics("\nStart: Symmetric Stable Distribution: > ")
par(mfcol = c(3, 2), cex = 0.7)
set.seed(1953)
r = rsymstb(n = 1000, alpha = 1.9)
plot(r, type = "l", main = "symstb: alpha = 1.9")
# Plot empirical density and compare with true density:
hist(r, n = 25, probability = TRUE, border = "white", col = "steelblue4")
x = seq(-5, 5, 0.1)
lines(x, dsymstb(x = x, alpha = 1.9))
# Plot df and compare with true df:
plot(sort(r), (1:1000/1000), main = "Probability", col = "steelblue4")
lines(x, psymstb(x, alpha = 1.9))
# Compute quantiles:
qsymstb(psymstb(q = seq(-10, 10, 1), alpha = 1.9), alpha = 1.9)

## stable
xmpBasics("\nNext: Skew Stable Distribution: > ")
# Compared to R, this might be quite slow under S-Plus ...
set.seed(1953)
r = rstable(n = 1000, alpha = 1.9, beta = 0.3)
plot(r, type = "l", main = "stable: alpha=1.9 beta=0.3")
# Plot empirical density and compare with true density:
hist(r, n = 25, probability = TRUE, border = "white", col = "steelblue4")
x = seq(-5, 5, 0.4)
lines(x, dstable(x = x, alpha = 1.9, beta = 0.3))
# Plot df and compare with true df:
plot(sort(r), (1:1000/1000), main = "Probability", col = "steelblue4")
lines(x, pstable(q = x, alpha = 1.9, beta = 0.3))
# Compute quantiles:
qstable(pstable(seq(-4, 4, 1), alpha = 1.9, beta = 0.3), alpha = 1.9, beta = 0.3)

## stable
xmpBasics("\nNext: Parameterization S1: > ")
set.seed(1953)
r = rstable(n = 1000, alpha = 1.9, beta = 0.3, pm = 1)
plot(r, type = "l", main = "S1 stable: alpha=1.9 beta=0.3")
# Plot empirical density and compare with true density
# (pm = 1 is passed here as well, to match the simulated sample):
hist(r, n = 25, probability = TRUE, border = "white", col = "steelblue4")
x = seq(-5, 5, 0.4)
lines(x, dstable(x = x, alpha = 1.9, beta = 0.3, pm = 1))
# Plot df and compare with true df:
plot(sort(r), (1:1000/1000), main = "Probability", col = "steelblue4")
lines(x, pstable(q = x, alpha = 1.9, beta = 0.3, pm = 1))
# Compute quantiles:
qstable(pstable(seq(-4, 4, 1), alpha = 1.9, beta = 0.3, pm = 1), alpha = 1.9, beta = 0.3, pm = 1)

HyperbolicDistribution    Generalized Hyperbolic Distribution

Description

A collection and description of functions to compute density, distribution function, quantile function and to generate random variates for three cases of the generalized hyperbolic distribution: the generalized hyperbolic distribution itself, the hyperbolic distribution, and the normal inverse Gaussian distribution.

The functions are:

[dpqr]gh     the generalized hyperbolic distribution,
[dpqr]hyp    the hyperbolic distribution,
hypMode      the hyperbolic mode,
[dpqr]nig    the normal inverse Gaussian distribution,
hypSlider    interactive hyperbolic distribution display,
nigSlider    interactive NIG distribution display.
Usage

dgh(x, alpha = 1, beta = 0, delta = 1, mu = 0, lambda = 1)
pgh(q, alpha = 1, beta = 0, delta = 1, mu = 0, lambda = 1)
qgh(p, alpha = 1, beta = 0, delta = 1, mu = 0, lambda = 1)
rgh(n, alpha = 1, beta = 0, delta = 1, mu = 0, lambda = 1)

dhyp(x, alpha = 1, beta = 0, delta = 1, mu = 0, pm = c(1, 2, 3, 4))
phyp(q, alpha = 1, beta = 0, delta = 1, mu = 0, pm = c(1, 2, 3, 4), ...)
qhyp(p, alpha = 1, beta = 0, delta = 1, mu = 0, pm = c(1, 2, 3, 4), ...)
rhyp(n, alpha = 1, beta = 0, delta = 1, mu = 0, pm = c(1, 2, 3, 4))

hypMode(alpha = 1, beta = 0, delta = 1, mu = 0, pm = c(1, 2, 3, 4))

dnig(x, alpha = 1, beta = 0, delta = 1, mu = 0)
pnig(q, alpha = 1, beta = 0, delta = 1, mu = 0)
qnig(p, alpha = 1, beta = 0, delta = 1, mu = 0)
rnig(n, alpha = 1, beta = 0, delta = 1, mu = 0)

hypSlider()
nigSlider()

Arguments

alpha, beta, delta, mu, lambda
    shape parameter alpha; skewness parameter beta, abs(beta) is in the range (0, alpha); scale parameter delta, delta must be zero or positive; location parameter mu, by default 0; and lambda parameter lambda, by default 1. This is the meaning of the parameters in the first parameterization pm=1, which is the default parameterization. In the second parameterization, pm=2, alpha and beta take the meaning of the shape parameters (usually named) zeta and rho. In the third parameterization, pm=3, alpha and beta take the meaning of the shape parameters (usually named) xi and chi. In the fourth parameterization, pm=4, alpha and beta take the meaning of the shape parameters (usually named) a.bar and b.bar.
n   number of observations.
p   a numeric vector of probabilities.
pm  an integer value between 1 and 4 for the selection of the parameterization. The default takes the first parameterization.
x, q   a numeric vector of quantiles.
...    arguments to be passed to the function integrate.

Details

Generalized Hyperbolic Distribution:
The generator rgh is based on the GH algorithm given by Scott (2004).
Hyperbolic Distribution:
The generator rhyp is based on the HYP algorithm given by Atkinson (1982).

Normal Inverse Gaussian Distribution:
The random deviates are calculated with the method described by Raible (2000).

Value

All values for the *gh, *hyp, and *nig functions are numeric vectors: d* returns the density, p* returns the distribution function, q* returns the quantile function, and r* generates random deviates. All values have an attribute named "param" listing the values of the distributional parameters.

The function hypMode returns the mode in the appropriate parameterization, a numeric value.

The functions hypSlider and nigSlider display, for educational purposes, the densities and probabilities of the hyperbolic and normal inverse Gaussian distributions.

Note

An undocumented R function for the modified Bessel function K1, named .BesselK1(X), is available; it is called by the S-Plus version of the program.

Author(s)

David Scott for the HYP Generator from R's "HyperbolicDist" package, Diethelm Wuertz for the Rmetrics R-port.

References

Atkinson A.C. (1982); The simulation of generalized inverse Gaussian and hyperbolic random variables, SIAM J. Sci. Stat. Comput. 3, 502–515.
Barndorff-Nielsen O. (1977); Exponentially decreasing distributions for the logarithm of particle size, Proc. Roy. Soc. Lond., A353, 401–419.
Barndorff-Nielsen O., Blaesild P. (1983); Hyperbolic distributions. In Encyclopedia of Statistical Sciences, Eds., Johnson N.L., Kotz S. and Read C.B., Vol. 3, pp. 700–707. New York: Wiley.
Raible S. (2000); Levy Processes in Finance: Theory, Numerics and Empirical Facts, PhD Thesis, University of Freiburg, Germany, 161 pages.
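To make the roles of alpha, beta, delta, and mu in the first parameterization concrete, the following base R sketch writes out the standard hyperbolic density of Barndorff-Nielsen with the modified Bessel function K1. It is an illustration only: hypDensity is a hypothetical name, the code is not the package's dhyp, and it assumes delta > 0 so that the normalizing constant is defined.

```r
# Hyperbolic density in the (alpha, beta, delta, mu) form.
# 'hypDensity' is a hypothetical name, not the package's dhyp.
hypDensity = function(x, alpha = 1, beta = 0, delta = 1, mu = 0) {
    stopifnot(alpha > 0, abs(beta) < alpha, delta > 0)
    g = sqrt(alpha^2 - beta^2)
    # Normalizing constant with the modified Bessel function K1:
    const = g / (2 * alpha * delta * besselK(delta * g, nu = 1))
    const * exp(-alpha * sqrt(delta^2 + (x - mu)^2) + beta * (x - mu))
}

# Sanity check: the density should integrate to one.
total = integrate(hypDensity, -Inf, Inf, alpha = 1, beta = 0.3, delta = 1)
total$value
```

The log density is a hyperbola in x, which is where the name comes from; beta tilts the two asymptotes and thereby controls the skewness.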
Examples

## SOURCE("fBasics.13B-HyperbolicDistribution")

## hyp
xmpBasics("\nStart: Hyperbolic Distribution > ")
par(mfcol = c(3, 2), cex = 0.5)
set.seed(1953)
r = rhyp(1000, alpha = 1, beta = 0.3, delta = 1)
plot(r, type = "l", col = "steelblue4", main = "hyp: alpha=1 beta=0.3 delta=1")
# Plot empirical density and compare with true density:
hist(r, n = 25, probability = TRUE, border = "white", col = "steelblue4")
x = seq(-5, 5, 0.25)
lines(x, dhyp(x, alpha = 1, beta = 0.3, delta = 1))
# Plot df and compare with true df:
plot(sort(r), (1:1000/1000), main = "Probability", col = "steelblue4")
lines(x, phyp(x, alpha = 1, beta = 0.3, delta = 1))
# Compute quantiles:
qhyp(phyp(seq(-5, 7, 1), alpha = 1, beta = 0.3, delta = 1), alpha = 1, beta = 0.3, delta = 1)
# Compute the mode:
hypMode(alpha = 1, beta = 0.3, delta = 1)

## nig
xmpBasics("\nNext: Normal Inverse Gaussian Distribution > ")
set.seed(1953)
r = rnig(5000, alpha = 1, beta = 0.3, delta = 1)
plot(r, type = "l", col = "steelblue4", main = "nig: alpha=1 beta=0.3 delta=1")
# Plot empirical density and compare with true density:
hist(r, n = 25, probability = TRUE, border = "white", col = "steelblue4")
x = seq(-5, 5, 0.25)
lines(x, dnig(x, alpha = 1, beta = 0.3, delta = 1))
# Plot df and compare with true df:
plot(sort(r), (1:5000/5000), main = "Probability", col = "steelblue4")
lines(x, pnig(x, alpha = 1, beta = 0.3, delta = 1))
# Compute quantiles:
qnig(pnig(seq(-5, 5, 1), alpha = 1, beta = 0.3, delta = 1), alpha = 1, beta = 0.3, delta = 1)

SmoothedSplineDistribution    Smoothed Spline Distribution

Description

A collection and description of functions to compute density, distribution function, quantile function and to generate random variates for empirical distributions. Estimates are done using smoothing spline ANOVA models with cubic spline, linear spline, or thin-plate spline marginals for numerical variables.
The functions are:

dssd    spline smoothed density,
pssd    spline smoothed probability function,
qssd    spline smoothed quantiles,
rssd    random deviates drawn from a ssd.

Usage

dssd(x, param)
pssd(q, param)
qssd(p, param)
rssd(n, param)

Arguments

n       number of observations.
p       a numeric vector of probabilities.
x, q    a numeric vector of quantiles.
param   an S3 object specifying the parameters as returned by the function ssdFit.

Details

This is an easy-to-use version of the functions implemented in Chong Gu's contributed R package "gss", which is downloadable from the CRAN server. If you require more functionality, e.g. to tailor the parameter estimate, we recommend installing Gu's package. The installation does not interfere with the functions implemented in Rmetrics.

Note that before you can use the functions you have to estimate the parameters param using the function ssdFit.

Value

All values are numeric vectors: d* returns the density, p* returns the distribution function, q* returns the quantile function, and r* generates random deviates.

Note

The functions do not implement the full functionality provided by R's contributed package "gss". Only the "cubic" spline method is provided and most of the optional arguments are set to default values. Since the original "gss" package does not interfere with Rmetrics you can load it in parallel. It is worth noting that the "gss" package does not work under SPlus, but the modified and adapted functions ssdFit and *ssd can be used.

Author(s)

Chong Gu for the code from R's contributed package 'gss', Diethelm Wuertz for the Rmetrics R-port.

References

Gu C., Wang J. (2003); Penalized likelihood density estimation: Direct cross-validation and scalable approximation, Statistica Sinica, 13, 811–826.
Gu C. (2002); Smoothing Spline ANOVA Models, New York, Springer-Verlag.
Examples ## SOURCE("fBasics.13C-SmoothedSplineDistribution") ## ssd ## Not run: xmpBasics("\nStart: Spline Smoothed Distribution > ") par(mfcol = c(2, 1), cex = 0.5) set.seed(1953) x = rnorm(1000) param = ssdFit(x) # Plot empirical density and compare with fitted density: hist(x, n = 25, probability = TRUE, border = "white", col = "steelblue4") s = seq(min(x), max(x), 0.1) lines(s, dssd(s, param), lwd = 2) # Plot df and compare with true df: plot(sort(x), (1:1000/1000), main = "Probability", col = "steelblue4") lines(s, pssd(s, param), lwd = 2) # Compute quantiles: qssd(pssd(seq(-3, 3, 1), param), param) ## End(Not run) DistributionFits Parameter Fit of a Distribution Description A collection and description of moment and maximum likelihood estimators to fit the parameters of a distribution. Included are estimators for the Student-t, for the stable, for the generalized hyperbolic hyperbolic, for the normal inverse Gaussian, and for empirical distributions. The functions are: tFit stableFit ghFit hypFit nigFit ssdFit print.ssd MLE parameter fit for a Student t-distribution, MLE and Quantile Method stable parameter fit, MLE parameter fit for a generalized hyperbolic distribution, MLE parameter fit for a hyperbolic distribution, MLE parameter fit for a normal inverse Gaussian distribution, smoothing spline estimation , S3 print method for objects returned from ’ssdFit’. DistributionFits 25 Usage tFit(x, df = 4, doplot = TRUE, span = "auto", title = NULL, description = NULL, ...) stableFit(x, alpha = 1.75, beta = 0, gamma = 1, delta = 0, type = c("q", "mle"), doplot = TRUE, title = NULL, description = NULL) ghFit(x, alpha = 1, beta = 0, delta = 1, mu = 0, lambda = 1, doplot = TRUE, span = "auto", title = NULL, description = NULL, ...) hypFit(x, alpha = 1, beta = 0, delta = 1, mu = 0, doplot = TRUE, span = "auto", title = NULL, description = NULL, ...) nigFit(x, alpha = 1, beta = 0, delta = 1, mu = 0, doplot = TRUE, span = "auto", title = NULL, description = NULL, ...) 
## S3 method for class 'fDISTFIT':
print(x, ...)

ssdFit(x, alpha = 1.4, seed = NULL, title = NULL, description = NULL)

## S3 method for class 'ssd':
print(x, ...)

Arguments

alpha, beta, gamma, delta, mu, lambda
    [ssdFit] alpha is the parameter defining the cross-validation score for smoothing parameter selection.
    [stable] The parameters are alpha, beta, gamma, and delta: index parameter alpha, in the range (0, 2]; skewness parameter beta, in the range [-1, 1]; scale parameter gamma; and shift parameter delta.
    [hyp] The parameters are alpha, beta, delta, mu, and lambda: shape parameter alpha; skewness parameter beta, abs(beta) is in the range (0, alpha); scale parameter delta, delta must be zero or positive; location parameter mu, by default 0; and lambda parameter lambda, by default 1. This is the meaning of the parameters in the first parameterization pm=1, which is the default parameterization. In the second parameterization, pm=2, alpha and beta take the meaning of the shape parameters (usually named) zeta and rho. In the third parameterization, pm=3, alpha and beta take the meaning of the shape parameters (usually named) xi and chi. In the fourth parameterization, pm=4, alpha and beta take the meaning of the shape parameters (usually named) a.bar and b.bar.
description   a character string which allows for a brief description.
df       [tFit] the number of degrees of freedom for the Student distribution, df > 2, may be non-integer. By default a value of 4 is assumed.
doplot   [tFit][hypFit][nigFit] a logical. Should a plot be displayed?
seed     [ssdFit] seed to be used for the random generation of "knots".
span     x-coordinates for the plot, by default 100 values automatically selected and ranging between the 0.001 and 0.999 quantiles.
Alternatively, you can specify the range by an expression like span=seq(min, max, times = n), where min and max are the left and right endpoints of the range, and n gives the number of intermediate points.
title    a character string which allows for a project title.
type     a character string which selects the method for parameter estimation: "mle", the maximum log-likelihood approach, or "q", McCulloch's quantile method.
x        [*Fit] a numeric vector. [print.ssd] an S3 object of class "ssd" as returned from the function ssdFit.
...      parameters passed to the function density and to the print.ssd function.

Details

Maximum Likelihood Estimation:
The function nlm is used to minimize the negative log-likelihood function. nlm carries out a minimization using a Newton-type algorithm.

Spline Smoothed Distribution:
Estimates are done using smoothing spline ANOVA models with cubic spline marginals for numerical variables.

Value

The functions tFit, hypFit and nigFit return a list with the following components:

estimate   the point at which the maximum value of the log-likelihood function is obtained.
minimum    the value of the estimated maximum, i.e. the value of the log-likelihood function.
code       an integer indicating why the optimization process terminated.
           1: relative gradient is close to zero, current iterate is probably solution;
           2: successive iterates within tolerance, current iterate is probably solution;
           3: last global step failed to locate a point lower than estimate. Either estimate is an approximate local minimum of the function or steptol is too small;
           4: iteration limit exceeded;
           5: maximum step size stepmax exceeded five consecutive times. Either the function is unbounded below, becomes asymptotic to a finite value from above in some direction, or stepmax is too small.
gradient   the gradient at the estimated maximum.
steps      number of function calls.
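The maximum likelihood approach described in the Details can be sketched with base R alone. The example below fits only the degrees of freedom of a Student-t sample by passing a negative log-likelihood to nlm. It is an illustration under stated assumptions, not the code of tFit: the name negLogLik and the optimization over log(df), which keeps the degrees of freedom positive, are choices made for this sketch.

```r
# Simulated Student-t sample with 4 degrees of freedom:
set.seed(1953)
x = rt(n = 1000, df = 4)

# Negative log-likelihood; optimizing over theta = log(df) keeps df positive.
# (This reparameterization is a choice made for this sketch.)
negLogLik = function(theta, x) -sum(dt(x, df = exp(theta), log = TRUE))

# nlm minimizes its first argument, starting from p:
fit = nlm(negLogLik, p = log(5), x = x)
c(df = exp(fit$estimate), objective = fit$minimum, code = fit$code)
```

The components estimate, minimum, gradient, and code of the nlm result correspond to the list entries described above; a code of 1 or 2 indicates that the iteration probably converged.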
The function ssdFit returns an S3 object of class "ssd" which contains as information the parameters to compute density, probability, quantiles, and random deviates for the functions [dpqr]ssd. DistributionFits 27 Note The function ssdFit does not implement the full functionality provided by R’s contributed package "gss". Only the "cubic" spline method is provided and most of the optional arguments are set to default values. Since the original "gss" package does not interfere with Rmetrics you can load it in parallel, and use the function ssden in place of ssdFit. It’s worth to note that the "gss" package does not work under SPlus, but the modified and adapted functions ssdFit and *ssd can be used. Author(s) Chong Gu for the code from R’s contributed package ’gss’, Diethelm Wuertz for the Rmetrics R-port. Examples ## SOURCE("fBasics.13D-DistributionFits") ## tFit xmpBasics("\nStart: MLE Fit to Student's t Density > ") par(mfrow = c(2,2), cex = 0.7, err = -1) options(warn = -1) # Simulated random variates t(4): set.seed(1953) s = rt(n = 1000, df = 4) # Note, this may take some time. # Starting vector: df.startvalue = 2*var(s)/(var(s)-1) tFit(s, df.startvalue, doplot = TRUE) ## ghFit ## hypFit xmpBasics("\nNext: MLE Fit to Hyperbolic Density > ") # Simulated random variates HYP(1, 0.3, 1, -1): set.seed(1953) s = rhyp(n = 1000, alpha = 1.5, beta = 0.3, delta = 0.5, mu = -1) # Note, this may take some time. # Starting vector (1, 0, 1, mean(s)): hypFit(s, alpha = 1, beta = 0, delta = 1, mu = mean(s), doplot = TRUE, width = 0.5) ## nigFit xmpBasics("\nNext: MLE Fit to Normal Inverse Gaussian Density > ") # Simulated random variates HYP(1.5, 0.3, 0.5, -1.0): set.seed(1953) s = rnig(n = 1000, alpha = 1.5, beta = 0.3, delta = 0.5, mu = -1.0) # Note, this may take some time. 
# Starting vector (1, 0, 1, mean(s)): nigFit(s, alpha = 1, beta = 0, delta = 1, mu = mean(s), doplot = TRUE) ## ssdFit xmpBasics("\nNext: Smoothed Spline Density > ") set.seed(1953) x = rnorm(1000) ssdFit(x) 28 StylizedFacts StylizedFacts Stylized Facts Description A collection and description of functions to investigate and to plot several stylized facts of economic and financial time series. This includes fat tails, autocorrelations, crosscorrelations, long memory behavior, and the Taylor effect. The functions to display stylized facts are: logpdfPlot qqgaussPlot scalinglawPlot acfPlot pacfPlot ccfPlot lmacfPlot lacfPlot teffectPlot logarithmic density plots, Gaussian quantile quantile plot, scaling behavior plot, autocorrelation function plot, partial autocorrelation function plot, cross correlation function plot, long memory autocorrelation function plot, lagged autocorrelation function plot, Taylor effect plot. Usage logpdfPlot(x, n = 50, doplot = TRUE, type = c("lin-log", "log-log"), ...) qqgaussPlot(x, span = 5, col = "steelblue4", main = "Normal Q-Q Plot", ...) scalinglawPlot(x, span = ceiling(log(length(x)/252)/log(2)), doplot = TRUE, ...) acfPlot(x, ...) pacfPlot(x, ...) ccfPlot(x, y, lag.max = max(2, floor(10*log10(length(x)))), type = c("correlation", "covariance", "partial"), ...) lacfPlot(x, n = 12, lag.max) lmacfPlot(x, lag.max = max(2, floor(10*log10(length(x)))), ci = 0.95, main = "ACF", doprint = TRUE) teffectPlot(x, deltas = seq(from = 0.2, to = 3, by = 0.2), lag.max = 10, ymax = NA, standardize = TRUE) Arguments ci [lmacf] the confidence interval, by default 95 percent, i.e. 0.95. col a character string denoting the plot color, by default "steelblue". deltas [teffectPlot] the exponents, a numeric vector, by default ranging from 0.2 to 3.0 in steps of 0.2. doplot a logical. Should a plot be displayed? doprint a logical, should the results be printed? lag.max maximum lag for which the autocorrelation should be calculated, an integer. 
main     a character string, the title of the plot.
n        [lacfPlot] an integer, the number of lags. [logpdfPlot] an integer, the number of break and count points.
span     [qqgaussPlot] an integer value which determines the plot range, by default 5. [scalinglawPlot] an integer value which determines a reasonable number of points for the scaling range; by default daily data with 252 business days per year are assumed.
standardize   a logical. Should the vector x be standardized?
type     a character, either e for "exceedances", d for "distances", or by default b for "both", selecting which plot should be displayed.
x, y     numeric vectors, [acfPlot][pacfPlot][ccfPlot] a numeric vector or matrix or a univariate or multivariate (not ccf) time series object.
ymax     [teffectPlot] maximum y-axis value on the plot; if is.na(ymax) is TRUE, the value is selected automatically.
...      for tsPlot one or more univariate or multivariate time series, else other arguments to be passed.

Details

Tail Behavior:
logpdfPlot and qqgaussPlot are two simple functions which allow a quick view on the tails of a distribution. The first creates a logarithmic or double-logarithmic density plot and returns breaks and counts. For the double-logarithmic plot, the negative side of the distribution is reflected onto the positive axis. The second creates a Gaussian Quantile-Quantile plot.

Scaling Behavior:
The function scalinglawPlot plots the scaling law of financial time series under aggregation and returns an estimate for the scaling exponent. The scaling behavior is a very striking effect of the foreign exchange market and also of other markets, expressing a regular structure for the volatility. Considering the average absolute return over individual data periods one finds a scaling power law which relates the mean volatility over given time intervals to the size of these intervals. The power law is in many cases valid over several orders of magnitude in time.
Its exponent usually deviates significantly from the value 1/2 implied by a Gaussian random walk model.

Autocorrelation Functions:
The functions acfPlot, pacfPlot, and ccfPlot plot and estimate the autocorrelation function, ACF, the partial autocorrelation function, PACF, and the cross-covariance and cross-correlation functions, CCF. The functions give a first view on correlations in and between time series. The functions are synonymous calls to R's acf, pacf, and ccf from the ts package.

Long Memory Autocorrelation Function:
The function lmacfPlot plots and estimates the long memory autocorrelation function and computes from the plot the Hurst exponent of a time series. The volatility of financial time series exhibits (in contrast to the logarithmic returns) in almost every financial market a slowly decaying autocorrelation function, ACF. We talk of long memory if the decay in the ACF is slower than exponential, i.e. the correlation function decreases algebraically with increasing (integer) lag. Thus it makes sense to investigate the decay on a double-logarithmic scale and to estimate the decay exponent. The function lmacfPlot calculates and plots the autocorrelation function of the vector x. If the time series exhibits long memory behaviour, this can easily be seen as a straight line in the plot. This double-logarithmic plot is displayed and a linear regression fit is done from which the intercept and slope are calculated. From the slope the Hurst exponent is derived.

Taylor Effect:
The "Taylor Effect" describes the fact that absolute returns of speculative assets have significant serial correlation over long lags. Moreover, autocorrelations of absolute returns are typically greater than those of squared returns. From these observations the Taylor effect states that the autocorrelations of absolute returns to the power of delta, abs(x-mean(x))^delta, reach their maximum at delta=1. The function teffectPlot explores this behaviour.
A plot is created which shows for each lag (from 1 to lag.max) the autocorrelations as a function of the exponent delta. If the hypothesis formulated above is supported, all the curves should peak at the same value around delta=1.

Value

logpdfPlot returns a list with the following components: breaks, histogram mid-point breaks; counts, histogram counts; fbreaks, fitted Gaussian breaks; fcounts, fitted Gaussian counts.

qqgaussPlot returns a Gaussian Quantile-Quantile Plot.

scalinglawPlot returns a list with the following components: exponent, the scaling exponent, a numeric value; fit, a list with the coefficients returned by lsfit, i.e. intercept and X.

acfPlot, pacfPlot, ccfPlot return an object of class "acf", see acf.

lmacfPlot returns a list with the following elements: fit, a list by itself with elements Intercept and slope X; hurst, the Hurst exponent; both are numeric values.

lacfPlot returns a list with the following two elements: Rho, the autocorrelation function; lagged, the lagged correlations.

teffectPlot returns a numeric matrix of order deltas by lag.max with the values of the autocorrelations.

Author(s)

Diethelm Wuertz for the Rmetrics R-port.

References

Taylor S.J. (1986); Modeling Financial Time Series, John Wiley and Sons, Chichester.
Ding Z., Granger C.W.J., Engle R.F. (1993); A long memory property of stock market returns and a new model, Journal of Empirical Finance 1, 83.
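The computation behind the Taylor effect described in the Details can be sketched in base R. The helper below is hypothetical (taylorEffect is not the package's teffectPlot, and it omits the plot and the standardize option); it only builds the deltas by lag.max matrix of autocorrelations of abs(x-mean(x))^delta, using the EuStockMarkets data shipped with R.

```r
# Autocorrelations of |x - mean(x)|^delta for a grid of exponents.
# 'taylorEffect' is a hypothetical helper, not the package's teffectPlot.
taylorEffect = function(x, deltas = seq(0.2, 3, by = 0.2), lag.max = 10) {
    centered = abs(x - mean(x))
    # One row per exponent delta, one column per lag 1..lag.max
    # (the first acf value is lag 0 and is dropped):
    result = t(sapply(deltas, function(d)
        acf(centered^d, lag.max = lag.max, plot = FALSE)$acf[-1]))
    dimnames(result) = list(paste("delta", deltas), paste("lag", 1:lag.max))
    result
}

# Daily log returns of the DAX from base R's EuStockMarkets data:
x = as.vector(diff(log(EuStockMarkets[, "DAX"])))
te = taylorEffect(x)

# Exponent at which each lag's autocorrelation peaks:
deltas = seq(0.2, 3, by = 0.2)
deltas[apply(te, 2, which.max)]
```

Plotting each column of the matrix against the exponents gives one curve per lag; under the Taylor effect these curves would peak near delta=1.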
Examples

## SOURCE("fBasics.14A-StylizedFacts")

## logpdfPlot
xmpBasics("\nStart: log PDF Plot > ")
# Plot the log-returns of the NYSE Composite Index
# and compare with the Gaussian Distribution:
par(mfrow = c(2, 2))
data(nyseres)
# Extract from data.frame:
x = nyseres[, 1]
logpdfPlot(x, main = "log PDF Plot")
# loglogpdfPlot
# Plot the log-returns of the NYSE Composite Index
# and compare with the Gaussian Distribution:
logpdfPlot(x, type = "log-log", main = "log-log PDF Plot")

## qqgaussPlot
xmpBasics("\nNext: QQ Normal Plot > ")
# Create a Gaussian Quantile-Quantile plot
# for the NYSE Composite Index log-returns:
qqgaussPlot(x)

## scalinglawPlot
xmpBasics("\nNext: Scaling Law Plot > ")
# Investigate and Plot the Scaling Law
# for the NYSE Composite Index log-returns:
scalinglawPlot(x)

## acfPlot
xmpBasics("\nNext: Auto-Correlation Function Plot > ")
data(EuStockMarkets)
par(mfrow = c(2, 1))
returns.ftse = diff(log(EuStockMarkets[,"FTSE"]))
returns.dax = diff(log(EuStockMarkets[,"DAX"]))
acfPlot(x = returns.ftse, main = "FTSE Autocorrelation")

## ccfPlot
xmpBasics("\nNext: Cross-Correlation Function Plot > ")
ccfPlot(x = returns.ftse, y = returns.dax, main="FTSE - DAX Crosscorrelation")

## lmacfPlot
xmpBasics("\nNext: Long-Memory ACF Plot > ")
# Estimate and plot the Long Memory ACF of the DAX volatilities
# and evaluate the Hurst exponent of a time series:
par(mfrow = c(2, 1))
lmacfPlot(abs(returns.dax), main = "DAX")

## teffectPlot
xmpBasics("\nNext: Taylor Effect Plot > ")
# Estimate and plot the Taylor Effect for the
# log returns of the DAX and FTSE indices:
teffectPlot(returns.dax)
teffectPlot(returns.ftse)

BasicStatistics    Basic Statistics Summary

Description

A collection and description of functions to compute basic statistical properties. Functions missing in R to calculate skewness and kurtosis are added, together with a function which creates summary statistics and functions to calculate column and row statistics.
The functions are:

skewness       returns value of skewness,
kurtosis       returns value of kurtosis,
basicStats     computes an overview of basic statistical values,
rowStats       calculates row statistics,
colStats       calculates column statistics,
rowAvgs        calculates row means,
colAvgs        calculates column means,
rowVars        calculates row variances,
colVars        calculates column variances,
rowStdevs      calculates row standard deviations,
colStdevs      calculates column standard deviations,
rowSkewness    calculates row skewness,
colSkewness    calculates column skewness,
rowKurtosis    calculates row kurtosis,
colKurtosis    calculates column kurtosis,
rowCumsums     calculates row cumulated sums,
colCumsums     calculates column cumulated sums.

For SPLUS Compatibility:

stdev          returns the standard deviation of a vector or matrix.

Usage

stdev(x, na.rm = FALSE)

skewness(x, ...)
## Default S3 method:
skewness(x, na.rm = FALSE, method = c("moment", "fisher"), ...)
## S3 method for class 'data.frame':
skewness(x, ...)
## S3 method for class 'POSIXct':
skewness(x, ...)
## S3 method for class 'POSIXlt':
skewness(x, ...)

kurtosis(x, ...)
## Default S3 method:
kurtosis(x, na.rm = FALSE, method = c("excess", "moment", "fisher"), ...)
## S3 method for class 'data.frame':
kurtosis(x, ...)
## S3 method for class 'POSIXct':
kurtosis(x, ...)
## S3 method for class 'POSIXlt':
kurtosis(x, ...)

basicStats(x, ci = 0.95, column = 1)

rowStats(x, FUN, na.rm = FALSE, ...)
rowAvgs(x, na.rm = FALSE, ...)
rowVars(x, na.rm = FALSE, ...)
rowStdevs(x, na.rm = FALSE, ...)
rowSkewness(x, na.rm = FALSE, ...)
rowKurtosis(x, na.rm = FALSE, ...)
rowCumsums(x, na.rm = FALSE, ...)

colStats(x, FUN, na.rm = FALSE, ...)
colAvgs(x, na.rm = FALSE, ...)
colVars(x, na.rm = FALSE, ...)
colStdevs(x, na.rm = FALSE, ...)
colSkewness(x, na.rm = FALSE, ...)
colKurtosis(x, na.rm = FALSE, ...)
colCumsums(x, na.rm = FALSE, ...)

Arguments

ci       confidence interval, a numeric value, by default 0.95, i.e. 95 percent.
column   [basicStats] which column should be selected from the input matrix, data frame or timeSeries object. By default an integer value set to 1.
FUN      [colStats][rowStats] the statistical function to be applied.
na.rm    a logical. Should missing values be removed?
method   [kurtosis][skewness] a character string which specifies the method of computation. These are either "moment" or "fisher"; kurtosis allows in addition for "excess". If "excess" is selected, then the value of the kurtosis is computed by the "moment" method and a value of 3 will be subtracted. The "moment" method is based on the definitions of skewness and kurtosis for distributions; these forms should be used when resampling (bootstrap or jackknife). The "fisher" method corresponds to the usual "unbiased" definition of sample variance, although in the case of skewness and kurtosis exact unbiasedness is not possible.
x        a numeric vector, or a matrix for column statistics. [basicStats] allows also a matrix, data.frame or timeSeries as input. In this case only the first column of data will be considered and a warning will be printed.
...      arguments to be passed.

Value

skewness, kurtosis
    return the value of the statistics, a numeric value. An attribute which reports the used method is added.

basicStats
    returns a data frame with the following entries and row names: nobs, NAs, Minimum, Maximum, 1. Quartile, 3. Quartile, Mean, Median, Sum, SE Mean, LCL Mean, UCL Mean, Variance, Stdev, Skewness, Kurtosis.

rowStats, rowAvgs, rowVars, rowStdevs, rowSkewness, rowKurtosis, rowCumsums
    compute sample statistics by row. Missing values can be handled.

colStats, colAvgs, colVars, colStdevs, colSkewness, colKurtosis, colCumsums
    compute sample statistics by column. Missing values can be handled.

Note

R's base package contains a function colMeans with an additional argument dim=1. Therefore, the function used here to compute column means (averages) is named colAvgs.
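The relation between the "moment" and "excess" methods described in the Arguments can be illustrated with textbook moment estimators in base R. This is a sketch under stated assumptions: momentSkewness and momentKurtosis are hypothetical names, and the normalization by the number of observations is the textbook choice, which may differ in detail from the normalization used by the package's skewness and kurtosis.

```r
# Textbook moment estimators; 'momentSkewness' and 'momentKurtosis' are
# hypothetical names, and the package may normalize differently.
momentSkewness = function(x) {
    z = x - mean(x)
    mean(z^3) / mean(z^2)^(3/2)
}
momentKurtosis = function(x, excess = TRUE) {
    z = x - mean(x)
    k = mean(z^4) / mean(z^2)^2
    # "excess" subtracts 3, the kurtosis of the normal distribution:
    if (excess) k - 3 else k
}

x = c(-2, -1, 0, 1, 2)
momentSkewness(x)                    # symmetric sample, so 0
momentKurtosis(x, excess = FALSE)    # 6.8 / 2^2 = 1.7
momentKurtosis(x)                    # 1.7 - 3 = -1.3
```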
The function stdev computes the standard deviation for a vector or matrix and was introduced for SPlus compatibility. Under R use the function sd.

Author(s)

Diethelm Wuertz for the Rmetrics R-port.

Examples

    ## SOURCE("fBasics.15A-BasicStatistics")

    ## basicStats
    xmpBasics("\nStart: Basic Statistics of log-Returns > ")
    # Data NYSE Composite Index:
    data(nyseres)
    basicStats(nyseres)

    ## mean
    ## var
    ## skewness
    ## kurtosis
    xmpBasics("\nNext: Moments, Skewness and Kurtosis > ")
    # Mean, Variance:
    mean(nyseres)
    var(nyseres)
    # Skewness, Kurtosis:
    class(nyseres)
    skewness(nyseres[, 1])
    kurtosis(nyseres[, 1])


PortableRandomInnovations    Generator for Portable Random Innovations

Description

A collection and description of functions to generate portable random innovations. The functions run under R and SPlus and generate the same sequence of random numbers. Supported are uniform, normal and Student-t distributed random numbers.

The functions are:

    set.lcgseed    Set initial random seed,
    get.lcgseed    Get the current value of the random seed,
    runif.lcg      Uniform linear congruential generator,
    rnorm.lcg      Normal linear congruential generator,
    rt.lcg         Student-t linear congruential generator.

Usage

    set.lcgseed(seed = 4711)
    get.lcgseed()

    runif.lcg(n, min = 0, max = 1)
    rnorm.lcg(n, mean = 0, sd = 1)
    rt.lcg(n, df)

Arguments

    df          number of degrees of freedom, a positive number, possibly non-integer.
    mean, sd    mean and standard deviation of the normal distributed innovations.
    min, max    lower and upper limits of the uniform distributed innovations.
    seed        an integer value, the random number seed.
    n           an integer, the number of random innovations to be generated.

Details

A simple portable random number generator for use in R and SPlus. We recommend using this generator only for comparisons of calculations in R and SPlus. The generator is a linear congruential generator with parameters LCG(a=13445, c=0, m=2^31-1, X=0).
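The recurrence behind such a generator, X[k+1] = (a * X[k] + c) mod m, can be sketched in base R as follows. The sketch assumes the modulus m = 2^31 - 1, takes the default seed 4711 from the usage of set.lcgseed, and assumes that uniforms are obtained by dividing the state by m; makeLcg is a hypothetical name, and the actual fBasics implementation may differ in these details.

```r
# Sketch of a linear congruential generator X[k+1] = (a * X[k] + inc) mod m.
# Assumptions: m = 2^31 - 1, uniforms are obtained as X / m, and the state
# is kept in a private environment instead of a global variable.
makeLcg = function(seed = 4711, a = 13445, inc = 0, m = 2^31 - 1) {
    env = new.env()
    env$state = seed
    list(
        runif = function(n, min = 0, max = 1) {
            u = numeric(n)
            for (i in seq_len(n)) {
                # a * state stays below 2^53, so double arithmetic is exact
                env$state = (a * env$state + inc) %% m
                u[i] = env$state / m
            }
            min + (max - min) * u
        },
        seed = function() env$state
    )
}

g1 = makeLcg(seed = 4711)
g2 = makeLcg(seed = 4711)
u1 = g1$runif(5)
u2 = g2$runif(5)
identical(u1, u2)   # same seed, same sequence
```

Because only integer arithmetic below 2^53 is involved, the same seed reproduces the same sequence on any platform with IEEE double precision, which is the property that makes such a generator portable.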
It is a simple random number generator which passes the bitwise randomness test.

Value

A vector of generated random innovations. The value of the current seed is stored in the variable lcg.seed.

Author(s)

Diethelm Wuertz for the Rmetrics R-port.

References

Altman, N.S. (1988); Bitwise Behavior of Random Number Generators, SIAM J. Sci. Stat. Comput., 9(5), September, 941–949.

Examples

    ## SOURCE("fSeries.15B-PortableRandomInnovations")

    ## set.lcgseed
    xmpBasics("\nStart: Set Initial Seed >")
    set.lcgseed(seed = 65890)

    ## runif.lcg - rnorm.lcg - rt.lcg
    xmpBasics("\nNext: Create Random Numbers >")
    cbind(runif.lcg(10), rnorm.lcg(10), rt.lcg(10, df = 4))

    ## get.lcgseed
    xmpBasics("\nNext: What is the current value of the seed? >")
    get.lcgseed()

    ## Note, to overwrite rnorm, use
    # rnorm = rnorm.lcg
    # Going back to rnorm
    # rm(rnorm)


HypothesisTesting    Tests Class Representation and Utilities

Description

Class representation and methods for objects of class ’fHTEST’.

The class representation and methods are:

    fHTEST    Representation for an S4 object of class "fHTEST",
    show      S4 print method.

Utility Functions:

    pPlot     General finite sample probability plot,
    pTable    interpolated probabilities from finite sample table,
    qTable    interpolated quantiles from finite sample table.

Usage

    show.fHTEST(object)

    pPlot(X, nN = 100, nStat = 100, logN = TRUE, logStat = FALSE,
        fill = FALSE, linear = TRUE, digits = 8, doplot = TRUE, ...)
    pTable(X, Stat, N, digits = 4)
    qTable(X, p, N, digits = 4)

Arguments

    digits           an integer value with the number of rounding digits.
    doplot           a logical flag. Should a plot be displayed?
    fill             [pPlot] a logical flag deciding if missing data should be filled with asymptotic values zero and one.
    linear           [pPlot] a logical flag indicating the type of interpolation.
    logN, logStat    [pPlot] two logical flags deciding if the data should be on a logarithmic scale or not.
    N                an integer value or vector of sample sizes.
    nN, nStat        [pPlot] two integer values with the size of the table.
    object           [show] an S4 object of class "fHTEST".
    p                a numeric value or vector of probabilities.
    Stat             a numeric value or vector of quantiles or statistic values.
    X                [pPlot][*Table] a data frame or matrix of a finite sample test table.
    ...              [pPlot] additional arguments to be passed.

Value

In contrast to R’s output report from S3 objects of class "htest" a different output report is produced. The tests return an S4 object of class "fHTEST". The object contains the following slots:

    @call           the function call.
    @data           the data as specified by the input argument(s).
    @test           a list whose elements contain the results from the statistical test. The information provided is similar to a list object of class "htest".
    @title          a character string with the name of the test. This can be overwritten by specifying a user defined input argument.
    @description    a character string with an optional user defined description. By default just the current date when the test was applied will be returned.

    statistic       the value(s) of the test statistic.
    p.value         the p-value(s) of the test.
    parameters      a numeric value or vector of parameters.
    estimate        a numeric value or vector of sample estimates.
    conf.int        a numeric two row vector or matrix of 95
    method          a character string indicating what type of test was performed.
    data.name       a character string giving the name(s) of the data.

The functions pPlot, pTable, and qTable plot and interpolate finite sample test statistic data from a table. The table is a data frame or a matrix where columns denote the size and rows the probabilities. The column and row names must hold the sizes and probabilities as character strings. The values of the matrix hold the statistic values.

Author(s)

Diethelm Wuertz for the Rmetrics R-port.
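The table layout described above, with probabilities as row names and sample sizes as column names, lends itself to simple two-step interpolation. The following base R sketch shows one way a quantile could be looked up for a probability and sample size that are not tabulated. The helper interpQuantile is hypothetical, the table values are toy numbers chosen for illustration, and plain linear interpolation is an assumption; the interpolation actually used by qTable may differ.

```r
# Toy finite sample table (illustrative numbers only, not real test values):
# rows are probabilities, columns are sample sizes, both stored as names.
X = matrix(c(1.0, 2.0, 4.0,
             1.2, 2.4, 4.8),
           nrow = 3,
           dimnames = list(c("0.1", "0.5", "0.9"), c("50", "100")))

# Hypothetical helper: quantile for probability p and sample size N,
# interpolating linearly first across sizes, then across probabilities.
interpQuantile = function(X, p, N) {
    probs = as.numeric(rownames(X))
    sizes = as.numeric(colnames(X))
    # statistic at size N for every tabulated probability
    atN = apply(X, 1, function(r) approx(sizes, r, xout = N)$y)
    approx(probs, atN, xout = p)$y
}

interpQuantile(X, p = 0.5, N = 100)   # tabulated value: 2.4
interpQuantile(X, p = 0.3, N = 75)    # between tabulated entries: 1.65
```

Swapping the roles of the statistic values and the probabilities in the second approx call gives the reverse lookup, from a statistic value to a probability.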
Examples

    ## SOURCE("fBasics.15C-HypothesisTesting")

    ## fHTEST
    getClass("fHTEST")

    ## pPlot
    ## jbTable
    # Interpolated plot of Small Jarque Bera Table:
    X = jbTable(type = "LM", size = "small")
    pPlot(X, linear = TRUE, logStat = TRUE)
    pPlot(X, linear = TRUE, logStat = TRUE, fill = TRUE, main = "JB LM")
    pPlot(X, linear = FALSE, logStat = TRUE)
    pPlot(X, linear = FALSE, logStat = TRUE, fill = TRUE)

    ## [pq]Table
    ## jbTable
    # Jarque Bera B q and p Table:
    X = jbTable(type = "LM", size = "small")
    p = (1:99)/100
    plot(qTable(X, p, N = 100), p, type = "b")
    Stat = seq(0.01, 15, length = 100)
    plot(Stat, pTable(X, Stat, N = 100), type = "b")


OneSampleTests    One Sample Tests

Description

A collection and description of one sample tests for testing normality or for detecting non-randomness in observations.

The functions for testing normality are:

    normalTest        test suite for some normality tests,
    ksnormTest        Kolmogorov-Smirnov normality test,
    shapiroTest       Shapiro-Wilk’s test for normality,
    jarqueberaTest    Jarque–Bera test for normality,
    dagoTest          D’Agostino normality test.

Functions for high precision Jarque Bera LM and ALM tests:

    jbTable    Table of finite sample p values for the JB test,
    pjb        Computes probabilities for the Jarque Bera Test,
    qjb        Computes quantiles for the Jarque Bera Test,
    jbTest     Performs finite sample adjusted JB LM and ALM test.

Additional functions for testing normality from the ’nortest’ package:

    adTest        Anderson–Darling normality test,
    cvmTest       Cramer–von Mises normality test,
    lillieTest    Lilliefors (Kolmogorov-Smirnov) normality test,
    pchiTest      Pearson chi–square normality test,
    sfTest        Shapiro–Francia normality test.

More tests ...

    runsTest    Runs test for detecting non-randomness,
    gofnorm     Prints a report on 13 different tests of normality.
Usage

    normalTest(x, method = c("ks", "sw", "jb", "da"))

    ksnormTest(x, title = NULL, description = NULL)
    shapiroTest(x, title = NULL, description = NULL)
    jarqueberaTest(x, title = NULL, description = NULL)
    dagoTest(x, title = NULL, description = NULL)

    jbTable(type = c("LM", "ALM"), size = c("all", "small"))
    pjb(q, N = Inf, type = c("LM", "ALM"))
    qjb(p, N = Inf, type = c("LM", "ALM"))
    jbTest(x, title = NULL, description = NULL)

    adTest(x, title = NULL, description = NULL)
    cvmTest(x, title = NULL, description = NULL)
    lillieTest(x, title = NULL, description = NULL)
    pchiTest(x, title = NULL, description = NULL)
    sfTest(x, title = NULL, description = NULL)

    runsTest(x)
    gofnorm(x, doprint = TRUE)

Arguments

    description    optional description string, or a vector of character strings.
    doprint        if TRUE, an exhaustive report is printed.
    method         [normalTest] indicates four different methods for the normality test, "ks" for the Kolmogorov-Smirnov one–sample test, "sw" for the Shapiro-Wilk test, "jb" for the Jarque-Bera Test, and "da" for the D’Agostino Test. The default value is "ks".
    N              an integer value specifying the sample size.
    p              a numeric vector of probabilities. Missing values are not allowed.
    q              vector of quantiles or test statistics. Missing values are not allowed.
    size           [jbTable] a character string denoting the size of the table. If set to "all" then all data are used from the table, if set to "small" then only a small part of the data will be returned.
    title          an optional title string; if not specified, the name of the input data is deparsed.
    type           [jbTest][pjb][qjb] the same for the Jarque Bera test statistic. "LM" denotes the Lagrange multiplier test, and "ALM" the adjusted Lagrange multiplier test.
    x              a numeric vector of data values or an S4 object of class timeSeries.

Details

The hypothesis tests may be of interest for many financial and economic applications, especially for the investigation of univariate time series returns.
Normal Tests:

Several tests are available for testing whether the records of a data set are normally distributed. The input to all these functions may be just a vector x or a univariate time series object x of class timeSeries.

First there is a wrapper function, normalTest, which calls one of the normality tests selected by its method argument, for example the Shapiro–Wilk test or the Jarque–Bera test. This wrapper was introduced for compatibility with S-Plus’ FinMetrics package.

Also available are the Kolmogorov–Smirnov one sample test and the D’Agostino normality test.

The remaining five normal tests are the Anderson–Darling test, the Cramer–von Mises test, the Lilliefors (Kolmogorov–Smirnov) test, the Pearson chi–square test, and the Shapiro–Francia test. They call functions from R’s contributed package nortest. The difference to the original test functions implemented in R and in contributed R packages is that the Rmetrics functions accept time series objects as input and give a more detailed output report.

The Anderson-Darling test is used to test if a sample of data came from a population with a specific distribution, here the normal distribution. The adTest goodness-of-fit test can be considered as a modification of the Kolmogorov–Smirnov test which gives more weight to the tails than does the ksnormTest.

Runs Test:

The runs test can be used to decide if a data set is from a random process. A run is defined as a series of increasing values or a series of decreasing values. The number of increasing, or decreasing, values is the length of the run. In a random data set, the probability that the (i+1)-th value is larger or smaller than the i-th value follows a binomial distribution, which forms the basis of the runs test.
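The notion of a run of increasing or decreasing values can be made concrete with a short base R sketch. It counts the runs up and down in a series and compares the count with the mean (2n - 1)/3 and variance (16n - 29)/90 that hold for a random sequence without ties, using a normal approximation. The helper runsUpDown is hypothetical; this is one common variant of a runs test and not necessarily the algorithm implemented by runsTest.

```r
# Hypothetical helper: count runs up and down and compare with the values
# expected for a random sequence. Assumes no ties between neighbours.
runsUpDown = function(x) {
    n = length(x)
    s = sign(diff(x))                  # +1 for a rise, -1 for a fall
    runs = 1 + sum(s[-1] != s[-length(s)])
    expected = (2 * n - 1) / 3
    variance = (16 * n - 29) / 90
    z = (runs - expected) / sqrt(variance)
    list(runs = runs, expected = expected, z = z,
         p.value = 2 * pnorm(-abs(z)))  # two-sided normal approximation
}

res = runsUpDown(c(1, 2, 3, 2, 1, 2))
res$runs       # 3 runs: up, down, up
res$expected   # 11/3
```

The normal approximation is only reasonable for longer series; the six-point series here merely shows how the runs are counted.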
Report from gofnorm Tests:

The function reports about the following goodness-of-fit tests for normality:

     1    Omnibus Moments Test for Normality
     2    Geary’s Test of Normality
     3    Studentized Range for Testing Normality
     4    D’Agostino’s D-Statistic Test of Normality
     5    Kuiper V-Statistic Modified to Test Normality
     6    Watson U-Squared-Statistic Modified to Test Normality
     7    Durbin’s Exact Test (Normal Distribution)
     8    Anderson-Darling Statistic Modified to Test Normality
     9    Cramer-Von Mises W-Squared-Statistic to Test Normality
    10    Kolmogorov-Smirnov D-Statistic to Test Normality
    11    Kolmogorov-Smirnov D-Statistic (Lilliefors Critical Values)
    12    Chi-Square Test of Normality (Equal Probability Classes)
    13    Shapiro-Francia W-Test of Normality for Large Samples

The functions are implemented from the GRASS GIS software package, an Open Source project available under the GNU GPL license.

Value

In contrast to R’s output report from S3 objects of class "htest" a different output report is produced. The tests here return an S4 object of class "fHTEST". The object contains the following slots:

    @call           the function call.
    @data           the data as specified by the input argument(s).
    @test           a list whose elements contain the results from the statistical test. The information provided is similar to a list object of class "htest".
    @title          a character string with the name of the test. This can be overwritten by specifying a user defined input argument.
    @description    a character string with an optional user defined description. By default just the current date when the test was applied will be returned.

    statistic       the value(s) of the test statistic.
    p.value         the p-value(s) of the test.
    parameters      a numeric value or vector of parameters.
    estimate        a numeric value or vector of sample estimates.
    conf.int        a numeric two row vector or matrix of 95
    method          a character string indicating what type of test was performed.
    data.name       a character string giving the name(s) of the data.
The meaning of the elements of the @test slot is the following:

    ksnormTest
        returns the values for the ’D’ statistic and p-values for the three alternatives ’two-sided’, ’less’ and ’greater’.
    shapiroTest
        returns the values for the ’W’ statistic and the p-value.
    jarqueberaTest, jbTest
        returns the values for the ’Chi-squared’ statistic with 2 degrees of freedom, and the asymptotic p-value. jbTest is the finite sample version of the Jarque Bera Lagrange multiplier, LM, and adjusted Lagrange multiplier test, ALM.
    dagoTest
        returns the values for the ’Chi-squared’, the ’Z3’ (Skewness) and ’Z4’ (Kurtosis) statistic together with the corresponding p values.
    adTest
        returns the value for the ’A’ statistic and the p-value.
    cvmTest
        returns the value for the ’W’ statistic and the p-value.
    lillieTest
        returns the value for the ’D’ statistic and the p-value.
    pchiTest
        returns the value for the ’P’ statistic and the p-values for the adjusted and not adjusted test cases. In addition the number of classes is printed, taking the default value due to Moore (1986) computed from the expression n.classes = ceiling(2 * (n^(2/5))), where n is the number of observations.
    sfTest
        returns the value for the ’W’ statistic and the p-value.

Note

Some of the test implementations are selected from R’s ctest and nortest packages.

Author(s)

R-core team for the tests from R’s ctest package,
Adrian Trapletti for the runs test from R’s tseries package,
Juergen Gross for the normal tests from R’s nortest package,
James Filliben for the Fortran program producing the runs report,
Paul Johnson for the Fortran program producing the gofnorm report,
Diethelm Wuertz and Helmut Katzgraber for the finite sample JB tests,
Diethelm Wuertz for the Rmetrics R-port.

References

Anderson T.W., Darling D.A. (1954); A Test of Goodness of Fit, JASA 49:765–69.
Conover W.J. (1971); Practical nonparametric statistics, New York: John Wiley & Sons.
D’Agostino R.B., Pearson E.S. (1973); Tests for Departure from Normality, Biometrika 60, 613–22.
D’Agostino R.B., Rosman B. (1974); The Power of Geary’s Test of Normality, Biometrika 61, 181–84.
Durbin J. (1961); Some Methods of Constructing Exact Tests, Biometrika 48, 41–55.
Durbin J. (1973); Distribution Theory Based on the Sample Distribution Function, SIAM, Philadelphia.
Geary R.C. (1947); Testing for Normality, Biometrika 36, 68–97.
Lehmann E.L. (1986); Testing Statistical Hypotheses, John Wiley and Sons, New York.
Linnet K. (1988); Testing Normality of Transformed Data, Applied Statistics 32, 180–186.
Moore D.S. (1986); Tests of the chi-squared type, In: D’Agostino, R.B. and Stephens, M.A., eds., Goodness-of-Fit Techniques, Marcel Dekker, New York.
Shapiro S.S., Francia R.S. (1972); An Approximate Analysis of Variance Test for Normality, JASA 67, 215–216.
Shapiro S.S., Wilk M.B., Chen V. (1968); A Comparative Study of Various Tests for Normality, JASA 63, 1343–72.
Thode H.C. (2002); Testing for Normality, Marcel Dekker, New York.
Weiss M.S. (1978); Modification of the Kolmogorov-Smirnov Statistic for Use with Correlated Data, JASA 73, 872–75.
Wuertz D., Katzgraber H.G. (2005); Precise finite-sample quantiles of the Jarque-Bera adjusted Lagrange multiplier test, ETHZ Preprint.
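The ’Chi-squared’ statistic with 2 degrees of freedom mentioned for jarqueberaTest can be sketched in its standard asymptotic textbook form, JB = n/6 * (S^2 + (K - 3)^2 / 4), where S and K are the moment skewness and kurtosis. The helper jbSketch is hypothetical and does not use the finite sample tables on which jbTest is based; it is meant only to show where the statistic and the asymptotic p-value come from.

```r
# Sketch of the asymptotic Jarque-Bera LM statistic; jbSketch is a
# hypothetical name and the code does not use the finite sample tables.
jbSketch = function(x) {
    n = length(x)
    d = x - mean(x)
    m2 = mean(d^2)
    S = mean(d^3) / m2^(3/2)       # moment skewness
    K = mean(d^4) / m2^2           # moment kurtosis
    stat = n / 6 * (S^2 + (K - 3)^2 / 4)
    # asymptotically chi-squared with 2 degrees of freedom
    list(statistic = stat,
         p.value = pchisq(stat, df = 2, lower.tail = FALSE))
}

jb = jbSketch(c(1, 2, 3, 4, 5))
jb$statistic   # equals 5/6 * 1.3^2 / 4
jb$p.value
```

For small samples the asymptotic p-value can be inaccurate, which is the motivation for the finite sample LM and ALM versions.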
Examples

    ## SOURCE("fBasics.15D-OneSampleTests")

    ## Series:
    xmpBasics("\nStart: Create Series > ")
    x = rnorm(100)

    ## ksnormTest
    xmpBasics("\nNext: Kolmogorov - Smirnov One-Sample Test > ")
    ksnormTest(x)

    ## shapiroTest
    xmpBasics("\nNext: Shapiro - Wilk Test > ")
    shapiroTest(x)

    ## jarqueberaTest
    xmpBasics("\nNext: Jarque - Bera Test > ")
    jarqueberaTest(x)
    jbTest(x)

    ## dagoTest
    xmpBasics("\nNext: D'Agostino Test > ")
    dagoTest(x)

    ## adTest
    xmpBasics("\nNext: Anderson - Darling Test > ")
    adTest(x)

    ## cvmTest
    xmpBasics("\nNext: Cramer - von Mises Test > ")
    cvmTest(x)

    ## lillieTest
    xmpBasics("\nNext: Lilliefors (KS) Test > ")
    lillieTest(x)

    ## pchiTest
    xmpBasics("\nNext: Pearson Chi-Squared Test > ")
    pchiTest(x)

    ## sfTest
    xmpBasics("\nNext: Shapiro - Francia Test > ")
    sfTest(x)

    ## gofnorm
    xmpBasics("\nNext: Goodness-of-Fit Test for Normality > ")
    gofnorm(x, doprint = TRUE)

    ## runsTest
    xmpBasics("\nNext: Runs Test > ")
    runsTest(x)


TwoSampleTests    Two Sample Tests

Description

A collection and description of functions for two sample statistical tests. The functions allow testing for distributional equivalence, for difference in location, variance and scale, and for correlations.

Distributional Equivalence:

    ks2Test          Two sample Kolmogorov-Smirnov test.

Difference in Locations:

    tTest            The t test,
    kw2Test          the Kruskal–Wallis test.

Difference in Variance:

    varfTest         The variance F test,
    bartlett2Test    the Bartlett test,
    fligner2Test     the Fligner–Killeen test.

Difference in Scale:

    ansariTest       The Ansari–Bradley test,
    moodTest         the Mood test.

Correlations:

    pearsonTest      Pearson’s coefficient,
    kendallTest      Kendall’s tau,
    spearmanTest     Spearman’s rho.

Test Distributions:

    [dpq]ansariw     Distribution of the Ansari W statistic.
Usage

    ks2Test(x, y, title = NULL, description = NULL)

    tTest(x, y, title = NULL, description = NULL)
    kw2Test(x, y, title = NULL, description = NULL)

    varfTest(x, y, title = NULL, description = NULL)
    bartlett2Test(x, y, title = NULL, description = NULL)
    fligner2Test(x, y, title = NULL, description = NULL)

    ansariTest(x, y, title = NULL, description = NULL)
    moodTest(x, y, title = NULL, description = NULL)

    pearsonTest(x, y, title = NULL, description = NULL)
    kendallTest(x, y, title = NULL, description = NULL)
    spearmanTest(x, y, title = NULL, description = NULL)

    dansariw(x = NULL, m, n = m)
    pansariw(q = NULL, m, n = m)
    qansariw(p, m, n = m)

Arguments

    description    optional description string, or a vector of character strings.
    m, n           [*ansariw] -
    p              [qansariw] a numeric vector of probabilities.
    q              [pansariw] a numeric vector of quantiles.
    title          an optional title string; if not specified, the name of the input data is deparsed.
    x, y           a numeric vector of data values.
                   [bartlett2Test][fligner2Test][kw2Test] here x is a list, where each element is either a vector or an object of class timeSeries. y is only used for the two–sample test situation, where x and y are two vectors or objects of class timeSeries.
                   [dansariw] a numeric vector of quantiles.

Details

The tests may be of interest for many financial and economic applications, especially for the comparison of two time series. The tests are grouped according to their functionalities.

Distributional Equivalence:

The test ks2Test performs a Kolmogorov–Smirnov two sample test that the two data samples x and y come from the same distribution, not necessarily a normal distribution. That means that it is not specified what that common distribution is.

Differences in Location:

The function tTest can be used to determine if the two sample means are equal for unpaired data sets. Two variants are used, assuming equal or unequal variances.
The function kw2Test performs a Kruskal-Wallis rank sum test of the null hypothesis that the central tendencies or medians of two samples are the same. The alternative is that they differ. Note that it is not assumed that the two samples are drawn from the same distribution. It is also worth knowing that the test assumes that the variables under consideration have underlying continuous distributions.

Differences in Variances:

The function varfTest can be used to compare variances of two normal samples by performing an F test. The null hypothesis is that the ratio of the variances of the populations from which they were drawn is equal to one.

The function bartlett2Test performs Bartlett’s test of the null hypothesis that the variances in each of the samples are the same. This property of equal variances across samples is also called homogeneity of variances. Note that Bartlett’s test is sensitive to departures from normality. That is, if the samples come from non-normal distributions, then Bartlett’s test may simply be testing for non-normality. The Levene test (not yet implemented) is an alternative to the Bartlett test that is less sensitive to departures from normality.

The function fligner2Test performs the Fligner-Killeen test of the null hypothesis that the variances in each of the two samples are the same.

Differences in Scale:

The function ansariTest performs the Ansari–Bradley two–sample test for a difference in scale parameters. Note that we have completely reimplemented this test based on the statistics and p-values computed from algorithm AS 93. The test returns, for any sizes of the series x and y, the exact p value together with its asymptotic limit. The test procedure is not limited to series shorter than length 50, as is the case for the function ansari.test implemented in R’s stats package. For the test statistics the following functions are available: dansariw, pansariw, and qansariw.
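The variance F test described under Differences in Variances can be reproduced by hand in base R. The sketch below computes the ratio of the sample variances and a two-sided p-value from the F distribution, and cross-checks the result against var.test from R’s stats package. The simulated data and the seed are arbitrary choices for illustration; that varfTest reports these same quantities is an assumption.

```r
set.seed(1)                  # arbitrary seed, for reproducibility only
x = rnorm(50)
y = rnorm(50, sd = 1.5)

# F statistic: ratio of the sample variances
Fstat = var(x) / var(y)
df1 = length(x) - 1
df2 = length(y) - 1

# Two-sided p-value under the null hypothesis of equal variances
pLower = pf(Fstat, df1, df2)
pValue = 2 * min(pLower, 1 - pLower)

# Cross-check against the implementation in R's stats package
ref = var.test(x, y)
all.equal(pValue, ref$p.value)
```

The F test relies on both samples being normal, which is why the text above points to alternatives that are less sensitive to departures from normality.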
The function moodTest is another test which performs a two–sample test for a difference in scale parameters. The underlying model is that the two samples are drawn from f(x-l) and f((x-l)/s)/s, respectively, where l is a common location parameter and s is a scale parameter. The null hypothesis is s=1.

Correlations:

The function pearsonTest tests for association between paired samples, using Pearson’s product moment correlation coefficient. The function kendallTest performs Kendall’s tau test. The function spearmanTest performs Spearman’s rho test.

Value

In contrast to R’s output report from S3 objects of class "htest" a different output report is produced. The classical tests presented here return an S4 object of class "fHTEST". The object contains the following slots:

    @call           the function call.
    @data           the data as specified by the input argument(s).
    @test           a list whose elements contain the results from the statistical test. The information provided is similar to a list object of class "htest".
    @title          a character string with the name of the test. This can be overwritten by specifying a user defined input argument.
    @description    a character string with an optional user defined description. By default just the current date when the test was applied will be returned.

    statistic       the value(s) of the test statistic.
    p.value         the p-value(s) of the test.
    parameters      a numeric value or vector of parameters.
    estimate        a numeric value or vector of sample estimates.
    conf.int        a numeric two row vector or matrix of 95
    method          a character string indicating what type of test was performed.
    data.name       a character string giving the name(s) of the data.

Note

Some of the test implementations are selected from R’s ctest package.

Author(s)

R-core team for the tests from R’s ctest package,
Diethelm Wuertz for the Rmetrics R-port.

References

Conover W.J. (1971); Practical nonparametric statistics, New York: John Wiley & Sons.
Durbin J. (1961); Some Methods of Constructing Exact Tests, Biometrika 48, 41–55.
Durbin J. (1973); Distribution Theory Based on the Sample Distribution Function, SIAM, Philadelphia.
Lehmann E.L. (1986); Testing Statistical Hypotheses, John Wiley and Sons, New York.
Moore D.S. (1986); Tests of the chi-squared type, In: D’Agostino, R.B. and Stephens, M.A., eds., Goodness-of-Fit Techniques, Marcel Dekker, New York.

Examples

    ## SOURCE("fBasics.15E-TwoSampleTests")

    ## x, y
    xmpBasics("\nStart: Create two Samples > ")
    x = rnorm(50)
    y = rnorm(50)

    ## ks2Test
    xmpBasics("\nNext: Distributional Tests > ")
    ks2Test(x, y)

    ## tTest | kw2Test
    xmpBasics("\nNext: Location Tests > ")
    tTest(x, y)
    kw2Test(x, y)

    ## varfTest | bartlett2Test | fligner2Test
    xmpBasics("\nNext: Variance Tests > ")
    varfTest(x, y)
    bartlett2Test(x, y)
    fligner2Test(x, y)

    ## ansariTest | moodTest
    xmpBasics("\nNext: Scale Tests > ")
    ansariTest(x, y)
    moodTest(x, y)

    ## pearsonTest | kendallTest | spearmanTest
    xmpBasics("\nNext: Correlation Tests > ")
    pearsonTest(x, y)
    kendallTest(x, y)
    spearmanTest(x, y)


fBasicsTools    fBasics Tools

Description

Tools used in the fBasics library.

Usage

    xmpfBasics()
    xmpBasics(prompt = "")

Arguments

    prompt    the string printed when prompting the user for input.

Value

    xmpfBasics    Pops up the example menu.
    xmpBasics     Nothing, the default, or the prompt if you have set xmpBasics = readline on the command prompt.

Note

The examples in the manual pages may be interactive and ask for input from the user. To achieve this you have to type on the command line:

    xmpBasics = readline

Author(s)

Diethelm Wuertz for the Rmetrics R-port.
Examples

    ## SOURCE("fBasics-xmpTools")