⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 gettickersdata.m

📁 This is a simple algorithm that downloads trading data from yahoo database. It is basically a large
💻 M
字号:
%{
Function for Downloading Trading Data from Yahoo

This is a simple algorithm that downloads trading data from yahoo based on a set of tickers
and user options.

The main advantage of this function is that it controls for bad data or for
the cases where yahoo server doesn't have the data you want. You can also set a
critera for bad data (eg. missing 5% of valid prices, when comparing to a full date ticker).
When bad data is found, the function skips that particular asset.

It is basicly a large scale application of sqq.m which is originally submited by
Michael Boldin, link at FEX description.

The steps of the algorithm are:

1) Get tickers from tickers.txt (you can change this file for you own purpose)
1) Download data using sqq() (controlling for bad data and missing prices)
2) Substitute values of 0 (price=0) for the most closest previous price
(this way you set return on that date = 0. This substitution shouldn't happen
much. You can control the filter for this event by using a reasonable value for variable Critera)

Please notes that SP500 changes composition over time. The composition
for the function is from 01-july-07. 

INPUT:

    firstDay    - Date in string notation. First date in sample (format: dd/mm/yyyy)

    lastDay    - Date in string notation. Last date in sample (format: dd/mm/yyyy)

    n_stocks    - Number of stocks which to get data from sp500 (in the same order as ticker.txt)

    ticker_fileName - Name of the txt file with the ticker (check the file itself for the structure)

    Freq        - (optional,default='d'). Frequency of data. Options: 'd' dailly, 'w' weekly, 'm', monthlly

    Criteria    - (optional, default=.05). The percentage of a full valid price vector (from a valid ticker eg. IBM)
                  that will set how many missing prices is enough to set bad data event (and cut such stock
                  from the downloaded database.

OUTPUT:

    SPData - A structure with the folowning fields:

    Date
    Close
    Open
    High
    Low
    Volume
    Closeadj
    Valid_Tickers - The downloaded valid tickers (those with enough valid prices and no yahoo problems)

    In such matrices, each collum represents each asset

Author: Marcelo Scherer Perlin
Contact:  marceloperlin@gmail.com
Phd Student, Reading University - ICMA/UK

Fell free to use it and/or modify it for your own interest.

%}

function [SPData]=GetTickersData(firstDay,lastDay,n_stocks,ticker_fileName,Freq,P_Criteria)

if nargin()<4
    error('The function needs at least 4 inputs');
end

if nargin()==4
    Freq='d';
    P_Criteria=.05;
end

if nargin()==5
    P_Criteria=.05;
end

if ~any([strcmp(Freq,'d') strcmp(Freq,'w') strcmp(Freq,'m')])
    error('The input Freq should be ''d'' for dailly data, ''w'' for weekly data or ''m'' for montly.')
end

if (exist(ticker_fileName,'file'))==0
    error(['The required file ' ticker_fileName ' was not found at working directory. Please check it.']);
end

if P_Criteria>1
    error('The input Criteria is a percentage and therefore lower than 1');
end

% Conversion of dates (from dd/mm/yyyy to mm/dd/yyyy)

firstDay=datestr(datenum(firstDay,'dd/mm/yyyy'),'mm/dd/yyyy');
lastDay=datestr(datenum(lastDay,'dd/mm/yyyy'),'mm/dd/yyyy');

% Getting ticker from ticker.txt

fid = fopen(ticker_fileName, 'r');
a=textscan(fid, '%s','delimiter',' ');
fclose(fid);

tickers=a{1,1};

fprintf(1,'\nDownloading IBM stock for Date comparisons <-');

% Downloading IBM for date comparinson (from my tests IBM is a reliable
% stock regarding dates)

try
    first_out=sqq_msp('ibm',lastDay,firstDay,Freq);
catch
    error('Problem with downloading the IBM stock for dates comparison. Please check internet conection and try again..');
end

% creting Criteria for bad data. When number of missing prices is higher than Criteria,
% then the stock is removed from database

Criteria=floor(P_Criteria*length(first_out));

fprintf(1,['Download complete:\n First Date in data -> ',datestr(first_out(1,1)),'\n Last  Date in data -> ',datestr(first_out(end,1))])
fprintf(1,['\nCriteria for bad data is: ',num2str(Criteria),' missing prices.\n']);

Date(:,1)=first_out(:,1);

fprintf(1,'\n-> Starting Yahoo Downloads <-');

Invalid_Tickers{1,1}='None';

idxInvalid=0;
idx=0;
missingPriceTable=zeros(1,n_stocks);
StatusCell=cell(1,n_stocks);
missingDates=cell(1,n_stocks);

for i=1:n_stocks
    missingDates{1,i}=cell(1,1);
    missingDates{1,i}{1,1}='None';
end

for i=1:n_stocks
    try
        fprintf(1,['\nAsset #',num2str(i),' - Ticker: ',tickers{i},' --> ']);

        out=sqq_msp(tickers{i},lastDay,firstDay,Freq);

        if size(out,1)==size(first_out(:,1),1) % cases where the dates match

            idx=idx+1;

            Date(:,idx)=out(:,1);
            Close(:,idx)=out(:,2);
            Open(:,idx)=out(:,3);
            Low(:,idx)=out(:,4);
            High(:,idx)=out(:,5);
            Volume(:,idx)=out(:,6);
            Closeadj(:,idx)=out(:,7);

            fprintf(1,'DL OK. Dates fully matching.');
            Valid_Tickers{1,idx}=tickers{i};
            
            StatusCell{1,i}='OK';

        elseif (length(first_out(:,1))-length(out(:,1)))<Criteria % cases where the dates DONT match

            idx=idx+1;

            for j=1:size(out,1) % this loop will set all prices according to matching dates of valid ticker (in this case IBM)

                [a]=find(out(j,1)==Date(:,1));

                Date(a,idx)=out(j,1);
                Close(a,idx)=out(j,2);
                Open(a,idx)=out(j,3);
                Low(a,idx)=out(j,4);
                High(a,idx)=out(j,5);
                Volume(a,idx)=out(j,6);
                Closeadj(a,idx)=out(j,7);

            end
            
            missingPriceTable(1,i)=length(first_out(:,1))-length(out(:,1));

            fprintf(1,['DL OK, but is missing ',num2str(length(first_out(:,1))-length(out(:,7))),' price(s)']);
            Valid_Tickers{1,idx}=tickers{i};
            
            StatusCell{1,i}='OK';
            [datesOut]=findMissingDates(out(:,1),first_out(:,1));
            missingDates{1,i}=datesOut;

        else
            fprintf(1,'DL OK, but number of missing prices is higher than criteria. Skipping this one.');
            
            StatusCell{1,i}='Not OK. Number of missing prices higher than criteria.';
            missingPriceTable(1,i)=length(first_out(:,1))-length(out(:,1));
            [datesOut]=findMissingDates(out(:,1),first_out(:,1));
            missingDates{1,i}=datesOut;
        end

    catch
        fprintf(1,'No Data found in Yahoo, skipping this one');
        
        StatusCell{1,i}='No Data Found in Yahoo.';
        
        idxInvalid=idxInvalid+1;
        Invalid_Tickers{1,idxInvalid}=tickers{i};
        continue
    end
end

% fixing cases where the first price = 0 (this would be a problem in the
% substituion part

[a]=find(Close(1,:)==0);

for j=1:size(a,2)
    for i=1:size(Close,1)
        if Close(i,a(j))~=0

            Close(1,a(j))=Close(i,a(j));
            Closeadj(1,a(j))=Closeadj(i,a(j));
            High(1,a(j))=High(i,a(j));
            Low(1,a(j))=Low(i,a(j));
            Open(i,a(j))=Open(i,a(j));

            break;
        end
    end

end

% Replacing prices = 0 with the oldest availabe price

for i=2:size(Close,1)
    for j=1:size(Close,2)
        if Close(i,j)==0

            Close(i,j)=Close(i-1,j);
            Closeadj(i,j)=Closeadj(i-1,j);
            High(i,j)=High(i-1,j);
            Low(i,j)=Low(i-1,j);
            Open(i,j)=Open(i-1,j);

        end
    end
end

% Conversion of dates (from mm/dd/yyyy to dd/mm/yyyy)

firstDay=datestr(datenum(firstDay,'mm/dd/yyyy'),'dd/mm/yyyy');
lastDay=datestr(datenum(lastDay,'mm/dd/yyyy'),'dd/mm/yyyy');

% Saving it all in the structure

SPData.Close=Close;
SPData.Closeadj=Closeadj;
SPData.Date=Date;
SPData.missingPriceTable=missingPriceTable;
SPData.StatusCell=StatusCell;
SPData.missingDates=missingDates;

for i=1:size(Date,1)
    SPData.DateStr{i,1}=datestr(Date(i,1));
end

SPData.Open=Open;
SPData.Volume=Volume;
SPData.Low=Low;
SPData.High=High;
SPData.Valid_Tickers=Valid_Tickers;

SPData.Invalid_Tickers=Invalid_Tickers;

SPData.Query.firstDay=firstDay;
SPData.Query.lastDay=lastDay;
SPData.Query.n_stocks=n_stocks;
SPData.Query.Freq=Freq;
SPData.Query.P_Criteria=P_Criteria;
SPData.Query.tickers=tickers;
SPData.Query.timeofQuery=datestr(now);

fprintf(1,'\n');

function [datesOut]=findMissingDates(datesAsset,datesBench)

datesOut=cell(1,1);

idx=1;
for i=1:size(datesAsset,1)
    myDate=datesBench(i);
    if sum(myDate==datesAsset)==0
        datesOut{idx,1}=datestr(myDate,'dd/mm/yyyy');
        idx=idx+1;
    end
end  

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -