A big data analytics based methodology for strategic decision making

Murat Özemre
BIMAR Information Technology Services, Izmir, Turkey, and

Ozgur Kabadurmus
Department of International Logistics Management, Yasar University, Izmir, Turkey

Abstract

Purpose – The purpose of this paper is to present a novel framework for strategic decision making using Big Data Analytics (BDA) methodology.

Design/methodology/approach – In this study, two different machine learning algorithms, Random Forest (RF) and Artificial Neural Networks (ANN), are employed to forecast export volumes using an extensive amount of open trade data. The forecasted values are included in the Boston Consulting Group (BCG) Matrix to conduct strategic market analysis.

Findings – The proposed methodology is validated using a hypothetical case study of a Chinese company exporting refrigerators and freezers. The results show that the proposed methodology makes accurate trade forecasts and helps to conduct strategic market analysis effectively. Also, the RF performs better than the ANN in terms of forecast accuracy.

Research limitations/implications – This study presents only one case study to test the proposed methodology. In future studies, the validity of the proposed method can be further generalized in different product groups and countries.

Practical implications – In today’s highly competitive business environment, an effective strategic market analysis requires importers or exporters to make better predictions and strategic decisions. Using the proposed BDA based methodology, companies can effectively identify new business opportunities and adjust their strategic decisions accordingly.

Originality/value – This is the first study to present a holistic methodology for strategic market analysis using BDA. The proposed methodology accurately forecasts international trade volumes and facilitates the strategic decision-making process by providing future insights into global markets.

Keywords Big data analytics, Strategic decision making, Trade volume forecasting, Machine learning

Paper type Research paper

1. Introduction
Today’s competitive business environment requires companies to make better predictions for their complex business environments and react to changing conditions immediately. To deal with this complexity, companies have started to include more data sources (internal and external) in their decision-making processes. Finding the right open/public data sources, updating them regularly, and integrating them with internal data are among the obstacles businesses face. Companies that overcome these big data obstacles and streamline their decision-making processes gain a competitive advantage in their market.

Although discussions about the definition of big data still continue (Hu et al., 2014), the main characteristics of big data are defined by five Vs: Volume, Velocity, Variety, Veracity and Value (Nguyen et al., 2018). Using the right data and the ability to derive meaningful results have become important for business decision-makers. The analysis of large volumes of data is called Big Data Analytics (BDA) (Barbosa et al., 2018a). To achieve real value from BDA, a variety of techniques and disciplines, including statistics, data mining, machine learning, social network analysis, signal processing, pattern recognition, optimization methods and visualization approaches, can be used (Chen and Zhang, 2014). Higher operational efficiency, better strategic decision-making, better visibility, improvement in customer service, and better new products (or services) are the most commonly agreed big data opportunities (Cao and Duan, 2017; Chen and Zhang, 2014). Although BDA can help reduce costs, increase agility and achieve higher service levels (Nguyen et al., 2018), less than 20% of companies have adopted BDA in their supply chains due to people, culture or process-related challenges (Chen and Zhang, 2014; Sivarajah et al., 2017; Verma and Bhattacharyya, 2017).

A BDA methodology for strategic decisions

The current issue and full text archive of this journal is available on Emerald Insight at: https://www.emerald.com/insight/1741-0398.htm

Received 8 August 2019
Revised 29 October 2019, 15 January 2020, 17 January 2020
Accepted 4 February 2020

Journal of Enterprise Information Management, Vol. 33 No. 6, 2020, pp. 1467-1490
© Emerald Publishing Limited, 1741-0398
DOI 10.1108/JEIM-08-2019-0222

With the increased availability of a wide range of data sources related to international trade, exporters and importers can assess the progress of trade volumes between countries and make their strategic decisions accordingly. Nevertheless, they usually look up statistics for a single market country, period and product, and therefore obtain very limited information about market behavior. With a holistic approach, they could continuously evaluate the market situation from different perspectives and use this market assessment in their decision-making process by simultaneously considering the market, product and time-related factors. All three aspects can be combined with the available data sources using BDA to give a clearer view of the specific country and product pairs that they should focus on in the future.

The main objective of this study is to develop a holistic methodology for strategic market analysis. In our proposed methodology, various international trade data are considered by using BDA, and managerial implications are devised for a company. The Boston Consulting Group (BCG) Matrix is adopted as the strategic management tool in this study. Our study also provides a nonparametric forecasting method using machine learning (ML) algorithms with an extensive amount of data. Note that ML algorithms have become essential parts of real-life BDA applications (Saggi and Jain, 2018; Zhou et al., 2017). In this study, two different ML algorithms, Random Forest (RF) and Artificial Neural Networks (ANN), have been applied to forecast the export volumes. The previous studies in the literature only developed forecasting models, but we used forecasting as an input to our proposed strategic market model and developed a holistic Big Data Analytics framework for strategic market analysis. Also, our proposed BDA approach employs a greater variety (and amount) of data and machine learning features to forecast export volumes than the other models in the literature. To demonstrate the effectiveness of our methodology, a hypothetical case study of a Chinese company exporting refrigerators and freezers to the United Kingdom is presented. This product group is one of the main export products of China, and the United Kingdom is one of the main importers. To validate the effectiveness of our methodology more thoroughly, three different sub-products are tested, and the results are discussed.

The rest of this paper is organized as follows. Section 2 summarizes the literature on the forecasting of trade volumes and related studies. In Section 3, the research goals and the problem definition are given in detail. The proposed holistic BDA framework, which combines the strategic market analysis model and forecasting international trade, is presented in Section 4. Section 5 demonstrates the application of the proposed methodology and reports the results of the case study in detail. Conclusions and future work are summarized in Section 6.

2. Literature review
Demand management is one of the key processes of Supply Chain Management (SCM) (Barbosa et al., 2018b), and BDA enables companies to understand their market demands better (Sanders, 2014; Tiwari et al., 2018). A proper understanding of market demands helps in designing supply chains that respond faster and more effectively to changing customer and supplier needs (Sanders, 2014). According to the latest literature review studies, the number of publications on BDA has increased dramatically over the last five years (Sivarajah et al., 2017) in parallel with the increase in supply chain management studies (Mishra et al., 2018; Nguyen et al., 2018). Demand management by using BDA covers 25% of all BDA in the SCM studies (Barbosa et al., 2018b). These studies mainly focus on forecasting, synchronizing and predicting customer demands.

According to Hu and Zhang (2018) and Akter et al. (2019), BDA can be applied to various interdisciplinary fields. For example, BDA has been effectively used in medical data analysis (Wang et al., 2017), financial market prediction (Yang et al., 2020), and customer churn prediction (Shirazi and Mohammadi, 2019). In addition to the availability of data, the advancement of computational power and ease of access to BDA technologies (Yaqoob et al., 2016) have also helped companies to automate their decision-making process using Artificial Intelligence methods (Duan et al., 2019). Therefore, the investments in Business Intelligence (BI) technologies significantly improve the business performance of companies (Surbakti et al., 2019) through prescriptive and predictive analysis using BDA. However, current studies (Akter et al., 2019) revealed that these investments could improve company performance only if they are integrated into the decision-making processes. Therefore, in this study, we focus on using BDA for forecasting in international trade and conducting strategic market analysis, as forecasting is one of the key areas in which BDA can help companies to make better decisions (Hazen et al., 2018).

Over the past years, various forecasting models have been applied to trade forecasting, using extrapolation, time-series and economic models, agent-based computational economics models, and machine learning algorithms (Nummelin and Hänninen, 2016). Two mainstream research approaches, parametric and nonparametric, have been used in trade forecasting models. AutoRegressive Integrated Moving Average (ARIMA), Holt–Winter and their variations are the most commonly used parametric time series forecasting techniques in trade volume forecasting. Dale and Bailey (1982) used the Box–Jenkins method to forecast US merchandise exports. Senhadji and Montenegro (1999) estimated export demand elasticities in developing and industrial countries by time-series techniques. Veenstra and Haralambides (2001) forecasted seaborne trade flows of four different commodity products (crude oil, iron ore, grain and coal) by using Vector Auto Regression (VAR). Akhtar (2003) investigated seasonality in Pakistan’s merchandise exports and imports by using a univariate ARIMA model. Sahu and Mishra (2013) used ARIMA to estimate spice import and export volumes and production behaviors for India and China. Khan (2011) used ARIMA, Holt–Winter and VAR techniques to forecast Bangladesh’s total imports. Similarly, Kargbo (2007) used ARIMA, VAR, Engle–Granger Single-Equation and Vector Error-Correction models to forecast South Africa’s agricultural exports and imports. Emang et al. (2010) built univariate time series models to forecast the export demand for molding and chipboard volume for Peninsular Malaysia.

The need to manage structural interdependencies and satisfy parametric assumptions makes parametric models harder to use in realistic applications. In addition to the difficulties of applying parametric models, the availability of machine learning tools and open data sources are other reasons to use nonparametric models. Although there are many studies on forecasting macro-economic indicators (Hassani and Silva, 2015), there are a limited number of studies on BDA to forecast trade volumes. Among them, Co and Boosarawongse (2007) compared their Artificial Neural Network model with parametric models (exponential smoothing) to forecast Thailand’s rice export. Their study showed that ANN yields better forecast results. Similarly, Pakravan et al. (2011) forecasted Iran’s rice import by using ANN. Pannakkong et al. (2016) showed that ANN performs better than ARIMA in forecasting Thailand’s cassava export volume. Silva and Hassani (2015) used Singular Spectrum Analysis to forecast US trade before, during and after the recession of 2008. Nummelin and Hänninen (2016) applied Support Vector Machines (SVM), ANN and RF models to forecast global bilateral trade flows of soft sawn wood. Similar to our model, their model uses not only export volumes but also other economic indicators, such as exchange rates and Gross Domestic Product (GDP). However, their model includes a limited number of factors, whereas our model employs a broad range of factors. Alam (2019) forecasted total annual exports and imports of the Kingdom of Saudi Arabia using ANN and ARIMA and found that ANN yields better results. Devyatkin et al. (2018) predicted the export gain of food products for the Russian Federation using ANN. Behrens (2019) used RF to forecast German export and import growth. Kuo and Li (2016) forecasted Taiwanese export trade using Support Vector Regression. Before forecasting, their model first normalizes the data using wavelet transform, and clusters the data with similar features using a firefly algorithm-based k-means algorithm. Shibasaki and Watanabe (2012) forecasted cargo flow in the Asia–Pacific Economic Cooperation region by using relations between economic growth, trade, and logistics demand models. Sokolov-Mladenović et al. (2016) modeled economic growth using ANN with trade, import and export parameters. Gupta and Kashyap (2015) forecasted inflation in G-7 countries using ANN. In our model, we used economic growth and inflation to predict import and export volumes. Kaastra and Boyd (1996) built an ANN model to forecast financial and economic time series. Ertugrul and Tagluk (2018) also used ANN to forecast financial indicators using a generalized behavioral learning method. Cui et al. (2019) analyzed the impact of a free trade agreement among China, Japan, and South Korea using BDA.

In addition to their use in trade and economic models, ANN algorithms are widely used in different forecasting fields (Palmer et al., 2006). Ayankoya et al. (2016) built a neural network model to predict grain commodity prices in South Africa. Doganis et al. (2006) used ANN and evolutionary computing for sales forecasting of short shelf-life food products. Sagaert et al. (2018) used BDA to improve the accuracy of tactical sales predictions for a major supplier of the tire industry. Macroeconomic factors are incorporated by Sagaert et al. (2019) in demand forecasting to improve tactical capacity planning. Palmer et al. (2006) forecasted tourism time series using ANN. Laptev et al. (2017) studied time-series modeling based on Long Short Term Memory (LSTM) to forecast extreme events for Uber.

Our study contributes to the literature by presenting a holistic Big Data Analytics framework for strategic market analysis. The previous studies in the literature only focused on developing forecasting methods to predict international trade volume; however, our study is the first to use BDA for strategic market analysis. Also, our BDA based forecasting model differs from the previous nonparametric forecasting studies since it employs a greater variety and quantity of data sources. In addition, our proposed methodology extends the BCG matrix by adding the predictions of future trends and market situations to help companies make better strategic market decisions.

3. Problem definition
To evaluate the current and future business situations in international trade, three different aspects are considered. The first aspect is the representation of the market. A single country, a geographical region (North Africa, Middle East, etc.) or a group of countries with common perspectives (OECD, EFTA, etc.) can be considered as a market. The second is the prediction horizon, which can vary from a month to several years. The third aspect is the product granularity. Product granularity is handled according to the Harmonized Commodity Description and Coding System (HS) levels. HS is an international nomenclature for the classification of products (UN Trade Statistics, 2017). The products are represented as six-digit codes in the HS. The first two digits of an HS code represent the chapter of the goods (e.g., “84” refers to “Machinery”). The group within that chapter is represented by the next two digits. For example, the “Refrigerator and freezers” group in the machinery chapter is coded by “84.18.” The last two digits identify a specific product within that group. For example, “84.18.40” represents the upright type freezers with a maximum capacity of 900 L.
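The three HS levels described above can be illustrated with a short sketch. The function name `parse_hs_code` and the returned dictionary keys are hypothetical choices made for this illustration; they are not part of the paper's methodology.

```python
def parse_hs_code(code: str) -> dict:
    """Split a six-digit HS code such as '84.18.40' into its three levels.

    Assumption: the code is written with or without dots between digit pairs.
    """
    digits = code.replace(".", "")
    if len(digits) != 6 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected a six-digit HS code, got {code!r}")
    chapter, group, product = digits[:2], digits[2:4], digits[4:]
    return {
        "chapter": chapter,                        # first two digits
        "group": f"{chapter}.{group}",             # chapter + next two digits
        "product": f"{chapter}.{group}.{product}", # full six-digit code
    }


levels = parse_hs_code("84.18.40")
print(levels)
```

Aggregating trade data at the `group` level versus the `product` level corresponds to choosing a coarser or finer product granularity.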


According to China’s export volumes, Chapter 84 (Machinery) is one of the most competitive export chapters and is within the top five export chapters to the United Kingdom. For our case study, the product group of refrigerators and freezers (HS = 84.18) is selected because it is closely related to end-customer behavior. To be more product specific, three different product types are selected from the top five products in the HS = 84.18 group. Export volume changes of this product group between the years 2013 and 2017 are taken into account in the selection process. The average annual growth rate of the HS = 84.18 group is 5%. To capture the different effects of various growth rates, these products are selected as follows. Product 84.18.10 is almost the same as (8%), Product 84.18.40 is above (13%), and Product 84.18.50 is below (−5%) the average growth rate of the 84.18 group. These three product types are defined as (UN Trade Statistics, 2017):

(1) 84.18.10: Refrigerators and freezers; combined refrigerator-freezers, fitted with separate external doors, electric or other.

(2) 84.18.40: Freezers; of the upright type, not exceeding 900 L capacity.

(3) 84.18.50: Furniture incorporating refrigerating or freezing equipment; for storage and display.
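The text does not state how the average annual growth rate between 2013 and 2017 is computed. As a minimal sketch, assuming a compound annual growth rate over the four-year span, the calculation could look as follows; the function name and the export values are hypothetical and are not taken from the paper.

```python
def average_annual_growth(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart.

    Assumption: growth is compounded yearly; the paper may use another definition.
    """
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Hypothetical export values for 2013 and 2017 (illustration only).
rate = average_annual_growth(100.0, 121.550625, 4)
print(f"{rate:.1%}")
```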

Our study addresses three essential research questions:

(1) How can we use open data sources in strategic market analysis for exporters/importers?

(2) Which strategic management models and open data can be used together for strategic market analysis?

(3) How can we create additional value by using Big Data Analytics in the selected strategic market analysis method?

These are the main research goals that we aim to address by using our proposed BDA framework. To test our methodology, we present a hypothetical case study in which a Chinese company exports refrigerators and freezers all over the world, with the United Kingdom as one of its main markets. According to OECD trade statistics, China is the largest exporter of refrigerators in the world, with a total export value of 10.1 billion dollars in 2018, which makes up 22.2% of total refrigerator exports. Although the case company is not real, it realistically portrays an export scenario where the company aims to analyze the future trends in the global market and its relative market share. As our methodology uses external open data sources and does not require internal data from the company, our case study effectively demonstrates the usage of our proposed strategic market analysis methodology based on BDA. We only need to assume an annual export value for the hypothetical case company to calculate its relative market share in the BCG matrix for demonstration purposes. Without loss of generality, any arbitrary export value can be chosen, as we show in Section 5. Therefore, the proposed methodology can be easily applied to a real company as long as the company’s annual export data can be obtained.

4. Methodology
Our proposed Big Data Analytics methodology is based on CRISP-DM (CRoss Industry Standard Process for Data Mining) (Wirth and Hipp, 2000). By using the steps of CRISP-DM (business understanding, data understanding, data preparation, modeling and evaluation) as stages of our framework, we added the required steps to develop a holistic methodology for market analysis using BDA. As shown in Figure 1, our proposed methodology is conducted on two different levels in parallel: (1) Strategic Market Analysis (SMA), and (2) Forecasting International Trade (FIT). These two levels and their steps are explained in detail in the following sections.

4.1 Strategic market analysis model
Exporters, importers and logistics service providers are considered as the potential users of our methodology. These companies export, import or facilitate movements of products from one country to another. Their portfolios consist of product and country pairs. Successful portfolio management is important to stay one step ahead in today’s competitive business environment (SMA1). International bilateral trade data between countries are publicly available and easily accessible by the companies. However, the key issue for the companies is to find a proper way of using these open international trade data in their decision-making processes (SMA2). International trade data are required for our case study in two ways. First, trade data give us the import market size of a country for a specific product at a certain time. Therefore, the market growth rate can be calculated using the data. Second, by combining the market size with the sales of the company, a company can derive its relative market share (SMA3).
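The two quantities derived from trade data can be sketched as below. This is a minimal illustration, assuming that the market growth rate is the period-over-period change in import market size and that the relative market share is the company's sales divided by that market size, as the paragraph above suggests. The function names and numbers are hypothetical.

```python
def market_growth_rate(previous_market_size: float, current_market_size: float) -> float:
    """Period-over-period growth of the import market for one product/country pair."""
    if previous_market_size <= 0:
        raise ValueError("previous_market_size must be positive")
    return (current_market_size - previous_market_size) / previous_market_size


def relative_market_share(company_sales: float, market_size: float) -> float:
    """Share of the import market covered by the company's own sales."""
    if market_size <= 0:
        raise ValueError("market_size must be positive")
    return company_sales / market_size


# Hypothetical figures (illustration only): market size in two periods and company sales.
growth = market_growth_rate(1000.0, 1030.0)
share = relative_market_share(50.0, 1030.0)
print(f"growth={growth:.2%}, share={share:.2%}")
```

Only the company sales figure is internal data; both market sizes come from open trade statistics, which is why the case study needs just one assumed export value.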

Figure 1. The proposed Big Data Analytics based methodology for strategic market analysis

Among various strategic management models in market analysis, the Boston Consulting Group (BCG) Matrix (Kotler and Armstrong, 2018) is the most suitable model for our study because it uses market growth rate and relative market share. The BCG Matrix allows managing the product portfolio of a company by showing opportunities in the market. Market growth rate and relative market share are the dimensions of the BCG Matrix. Based on these two dimensions, product groups are classified into four quadrants: Stars, Cash Cows, Question Marks and Dogs (SMA4). In our methodology, we extend the BCG matrix by adding forecast information about the market (FIT1). The market forecasts of the products are depicted with the grey spots, assuming that the company sales volumes remain the same as the previous period. In our study, the forecast horizon is chosen as three months (SMA5, FIT6). In Figure 2, an example of the BCG Matrix is presented for a hypothetical company which exports three types of products to several countries. Colored spots show the past data, where the spot sizes are proportional to the sales volumes of companies.

Figure 2. A sample BCG matrix for a hypothetical company (colored spots show the current situation and the grey spots show the forecasted situation)

In the BCG Matrix, Stars are high market share and high market growth rate products. Therefore, companies are advised to invest in stars. In the hypothetical example given in Figure 2, the export products of 84.18.40 to Sweden, the USA and Germany, and 84.18.10 to Italy are in the Stars quadrant. According to the forecasts (depicted as their grey shades), they would stay in the same quadrant in the near future (i.e., the next 3–6 months). Therefore, investment in those markets is crucial. High market share and low market growth rate are defined as Cash Cows. Some investment is recommended for cash cows to maintain a certain level of cash flow. This action could be applied to the export products of 84.18.50 to Germany, Netherlands and the UK, and 84.18.10 to the USA. Note that the export product of 84.18.10 to Germany has the potential to move from cash cows to stars, which may require more investment like a star than a regular cash cow. Dogs are located in the low market share and low market growth rate quadrant. These product groups are prime candidates for downsizing (or even exiting the market). In Figure 2, the export products of 84.18.50 to Saudi Arabia and 84.18.40 to the UK are in this quadrant. The export product of 84.18.10 to France is forecasted to move from Dogs to Cash Cows. Therefore, the market exit plan can be postponed until its relative market share decreases. Low market share and high market growth rate define Question Marks. The BCG matrix recommends investing in “question marks” if the product has the potential to move into stars (or divesting, otherwise). The export products of 84.18.10 to the UK and 84.18.50 to Iraq are question marks. The export product of 84.18.40 to France has the potential for investment because of its movement to the Stars quadrant. Note that a two percent market growth rate is taken as the separating value between Stars and Cash Cows in the BCG Matrix. For relative market share, four percent is taken as the threshold value between Dogs and Cash Cows.
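The quadrant rules and thresholds described above can be expressed as a small classifier. This is a sketch under stated assumptions: values exactly on a threshold are treated as "high" (the text does not say how ties are handled), and the forecasted position keeps company sales fixed while the market size changes, following the assumption stated for the grey spots. All names and figures are hypothetical.

```python
GROWTH_THRESHOLD = 0.02  # two percent market growth rate (from the text)
SHARE_THRESHOLD = 0.04   # four percent relative market share (from the text)


def bcg_quadrant(growth_rate: float, relative_share: float) -> str:
    """Classify a product/country pair into one of the four BCG quadrants."""
    high_growth = growth_rate >= GROWTH_THRESHOLD  # assumption: ties count as high
    high_share = relative_share >= SHARE_THRESHOLD
    if high_growth and high_share:
        return "Star"
    if high_share:
        return "Cash Cow"
    if high_growth:
        return "Question Mark"
    return "Dog"


def forecast_position(company_sales: float, current_market: float, forecast_market: float):
    """Forecasted (growth, share), assuming company sales stay at their current level."""
    growth = (forecast_market - current_market) / current_market
    share = company_sales / forecast_market
    return growth, share


# Hypothetical pair: currently 1% growth and 5% share, market forecast to grow 3%.
current = bcg_quadrant(0.01, 0.05)
future = bcg_quadrant(*forecast_position(50.0, 1000.0, 1030.0))
print(current, "->", future)
```

A change between `current` and `future` is the kind of quadrant movement the grey spots in Figure 2 are meant to reveal.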

Entering new markets to enrich the portfolio is another strategic decision that helps to increase the number of question marks in the matrix. As demonstrated by the example presented in Figure 2, the BCG Matrix covers all product types and countries, which makes it easy to identify the current market conditions of the newly entered markets. In today’s rapidly changing business environment and with the constant flow of big data, product portfolios can be evaluated more frequently, even monthly, to identify current trends promptly and act accordingly (SMA6). Our proposed methodology helps to identify future trends and market situations by adding forecasted information to the BCG matrix.

4.2 Forecasting international trade by using big data analyticsTo accurately forecast international trade, the dynamics and potential factors affectingbilateral trade between countries must be identified first (FIT1). This is the business-understanding step of the trade forecasting. These factors can be grouped into two maingroups: (1) Product-specific trade information, and (2) Country or global conditions relatedfactors. The main components affecting bilateral trade are the supply and demand factors ofthe related countries (Ayankoya et al., 2016; Nummelin andH€anninen, 2016). Formodeling thedemand for a specific product in a target country (in our case study, the United Kingdom), theexport volumes of the top five exporters to the target country are considered. For modelingthe supply from the source country (in our case study, China), the trade volumes of the top fivetarget countries that the source country exports are taken into account. As the last product-specific factor, the unit value of the traded product is included in the model. The country orglobal factors are divided into the subgroups of the political environment, businessenvironment (Bovi and Cerqueti, 2016), economic environment (Keck et al., 2010; Sokolov-Mladenovi�c et al., 2016) and trade environment-related factors. The business environment isincluded by adding Business Confidence and Consumer Confidence Indicators. With theinclusion of the Economic Political Uncertainty index, political factors are covered. Economicfactors are GDP, Exchange Rates, Composite Leading Indicators, Consumer and ProducerPrice Indices. World Trade Volume and World Economic Political Uncertainty indices areincluded as global trade parameters. Table 1 summarizes all these factors used in theforecasting model (FIT2). For obtaining these data, four different open data sources are used,as reported in the subgroup column of Table 1. 
The trade model notations used in our study are based on the commonly accepted world trade model notations, and they are listed as follows:

$X^{t}_{ijk}$ = Trade flow volume from source (exporting) country $i$ to target (importing) country $j$ for product $k$ in period $t$

$i_0$ = Source country (China)

$j_0$ = Target country (United Kingdom)

$K$ = Set of products, where $k \in K = \{84.18.10,\ 84.18.40,\ 84.18.50\}$

$I_{j_0 k}$ = The top five source countries for $j_0$ for product $k$, $\forall k \in K$

$J_{i_0 k}$ = The top five target countries for $i_0$ for product $k$, $\forall k \in K$

Table 1. Features (factors) used in the forecasting model

| Main group | Subgroup (source) | Description of the feature | Similar studies that used the same feature |
|---|---|---|---|
| Product specific | Trade information (Source: International Trade Center) | Supply capacity of the source country for product $k$: $X^{t}_{i_0 j k},\ \forall j \in J_{i_0 k},\ k \in K$; total supply capacity of the source country for product $k$: $X^{t}_{i_0 k},\ \forall k \in K$; demand capacity of the target country for product $k$: $X^{t}_{i j_0 k},\ \forall i \in I_{j_0 k},\ k \in K$; unit value of the product from source country to target country: $XUV^{t}_{i_0 j_0 k},\ \forall k \in K$ | Predicting grain price (Ayankoya et al., 2016); predicting bilateral trade (Nummelin and Hänninen, 2016) |
| Global | Political environment (Source: Economic Policy Uncertainty) | Economic Policy Uncertainties of source and target countries, and the world: $EPU^{t}_{i},\ \forall i \in \{i_0, j_0, \text{World}\}$ | |
| Global | Business environment (Source: OECD) | Business Confidence Indicators of source and target countries: $BCI^{t}_{i},\ \forall i \in \{i_0, j_0\}$; Consumer Confidence Indicators of source and target countries: $CCI^{t}_{i},\ \forall i \in \{i_0, j_0\}$ | Predicting macroeconomic fundamentals (Bovi and Cerqueti, 2016) |
| Global | Economic environment (Source: OECD) | Composite Leading Indicators of source and target countries: $CLI^{t}_{i},\ \forall i \in \{i_0, j_0\}$; GDPs of source and target countries: $GDP^{t}_{i},\ \forall i \in \{i_0, j_0\}$ | Predicting bilateral trade (Nummelin and Hänninen, 2016); predicting GDP (Sokolov-Mladenović et al., 2016); predicting import volumes (Keck et al., 2010) |
| Global | Economic environment (Source: OECD) | Producer Price Indices of source and target countries: $PPI^{t}_{i},\ \forall i \in \{i_0, j_0\}$; Consumer Price Indices of source and target countries: $CPI^{t}_{i},\ \forall i \in \{i_0, j_0\}$ | Predicting import volumes (Keck et al., 2010) |
| Global | Economic environment (Source: OECD) | Exchange rate of local currencies of source and target countries to USD: $EXC^{t}_{i},\ \forall i \in \{i_0, j_0\}$ | As currency exchange rates (Ayankoya et al., 2016); predicting bilateral trade (Nummelin and Hänninen, 2016) |
| Global | Trade environment (Source: CPB World Trade Monitor) | Total world trade volume: $WorldTrade^{t}$ | |

Before applying Big Data Analytics methods, the data must be prepared. Data sets from different sources must be converted into a single format. In addition, multiple entries are cleaned, and data are trimmed to exclude missing values. Time windowing is applied to data points, i.e., the values are shifted three months back. Data labels are standardized for easy understanding. For example, CHN_UK_84.18.10-3 stands for the export volume from China to the United Kingdom for product 84.18.10, where the values are shifted three months back (FIT3).
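The time windowing step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the helper names `lag_series` and `build_lagged_features` and the sample values are hypothetical; only the three-month shift and the label pattern (for example, CHN_UK_84.18.10-3) are taken from the text.

```python
from typing import Dict, List, Optional, Tuple


def lag_series(values: List[float], lag: int) -> List[Optional[float]]:
    """Shift a monthly series `lag` months back; the first `lag` rows have no history."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return list(values)
    return [None] * min(lag, len(values)) + list(values[:-lag])


def build_lagged_features(
    raw: Dict[str, List[float]], lag: int
) -> Tuple[Dict[str, List[float]], List[int]]:
    """Append '-<lag>' to each label and trim the rows that contain missing values."""
    lagged = {f"{name}-{lag}": lag_series(vals, lag) for name, vals in raw.items()}
    n_rows = len(next(iter(lagged.values())))
    keep = [t for t in range(n_rows)
            if all(col[t] is not None for col in lagged.values())]
    trimmed = {name: [col[t] for t in keep] for name, col in lagged.items()}
    return trimmed, keep


# Hypothetical monthly export volumes ($1,000); illustrative values only.
raw = {"CHN_UK_84.18.10": [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]}
features, kept_rows = build_lagged_features(raw, lag=3)
print(features)   # {'CHN_UK_84.18.10-3': [100.0, 110.0, 120.0]}
print(kept_rows)  # [3, 4, 5]
```

Each retained row pairs a month with the value observed three months earlier, so a model trained on such rows only sees information that would be available at forecasting time.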

Decision trees, neural networks, support vector machines and association rule analysis are some of the algorithms used to derive knowledge from data reflecting conditions, processes and patterns (Stahlbock and Voß, 2010). Our data set contains monthly inputs and can be classified as a multivariate time series. Because of their ability to learn complex relationships from such multivariate data (Mishra et al., 2016), the Random Forest and Artificial Neural Network algorithms are selected as the forecasting methods of our study (FIT4).

4.3 Random forest (RF)

Random forest is a machine learning method that is widely used for both classification and regression. RF applications are reported to yield successful results for small datasets with a relatively large number of features (Grömping, 2009). Breiman (2001) developed this ensemble learning approach by combining different decision trees. Prediction with a single decision tree can overfit the training set, which yields low-quality results on the test set. To avoid this overfitting, which stems from the single dimension of randomness in a single decision tree, the RF algorithm trains each tree with a random subset of the complete data set. With the inclusion of this second dimension of randomness by using random subsets, the random forest algorithm can reach high stability and robustness. This property allows using the best features among a random subset of candidate features. Therefore, the RF algorithm is used not only in forecasting but also in feature selection. In the random forest algorithm, the best descriptive combination is selected from a large number of trees created during the algorithm (Breiman, 2001). The general process of RF algorithms is explained in Figure 3.
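As a rough illustration of the two sources of randomness described above, the following sketch fits scikit-learn's `RandomForestRegressor` on synthetic data. The data, the target function and the parameter values are assumptions made for this example and do not reproduce the study's data set or tuned settings; `bootstrap=True` draws a random sample of rows for each tree, and `max_features=0.5` restricts each split to a random half of the candidate features.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for a small data set with many features (assumption):
# 144 monthly rows and 28 features, of which only the first two drive the target.
rng = np.random.default_rng(0)
X = rng.uniform(size=(144, 28))
y = 3.0 * X[:, 0] + np.sin(4.0 * X[:, 1]) + rng.normal(scale=0.05, size=144)

forest = RandomForestRegressor(
    n_estimators=200,   # number of trees in the ensemble
    bootstrap=True,     # each tree is trained on a random sample of the rows
    max_features=0.5,   # each split considers a random half of the features
    random_state=0,
)
forest.fit(X[:115], y[:115])

# R^2 on the held-out rows and the feature ranked as most important.
r2_test = forest.score(X[115:], y[115:])
most_important = int(np.argmax(forest.feature_importances_))
print(f"test R^2 = {r2_test:.3f}, most important feature index = {most_important}")
```

The `feature_importances_` attribute is one way such a model can support feature selection, which is the second use of RF mentioned in the text.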

4.4 Artificial neural networks (ANN)

Artificial Neural Networks are computational structures of simple, connected processing nodes that are designed to mimic biological neural systems (Nummelin and Hänninen, 2016; Zhang et al., 1998). ANN has been applied in various areas for the last two decades, including forecasting, credit scoring, financial analysis, and fraud analysis (Chen and Zhang, 2014; Tkáč and Verner, 2016). ANN is based on the accumulation of knowledge during training sessions and is a valuable tool for pattern recognition, classification and forecasting. In this study, a feedforward Multi-layer Perceptron (MLP) is applied, because it is one of the most widely used topologies in forecasting, and it has low resource requirements. The MLP topology consists of three parts: the input layer, the hidden layers and the output layer. The input layer transfers raw input data to the network. The number of nodes in the input layer depends on the number of features used in the model. The hidden part of the network can consist of multiple layers, each with many nodes. After the hidden layers, the final solution is constructed in the output layer. Figure 4 presents an MLP topology with three inputs, two hidden layers and a single output. Note that each node in the MLP is fed from the nodes of the previous layer, and the layers are fully connected.
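The topology described above (three inputs, two hidden layers, one output) can be illustrated with scikit-learn's `MLPRegressor`. This is a sketch on synthetic data, which is an assumption of the example; the solver, activation, alpha, iteration limit and hidden layer sizes are the values reported for product 84.18.50 in Table 4 and are used here only to show how such settings are passed to the class.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic data with three input features and one output (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(size=(144, 3))
y = X @ np.array([1.0, -2.0, 0.5])

mlp = MLPRegressor(
    hidden_layer_sizes=(10, 10),  # two hidden layers with ten nodes each
    activation="identity",
    solver="lbfgs",
    alpha=0.001,
    max_iter=50000,
    random_state=0,
)
mlp.fit(X, y)

# One weight matrix per pair of consecutive, fully connected layers.
layer_shapes = [w.shape for w in mlp.coefs_]
print(layer_shapes)  # [(3, 10), (10, 10), (10, 1)]
```

The shapes of the weight matrices reflect the fully connected structure: every node receives input from all nodes of the previous layer.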

Both RF and ANN algorithms depend on various parameters. Some parameters are set before the training process, and they are called hyperparameters. For RF, maximum features, minimum samples leaf, and maximum leaf nodes are some of the hyperparameters used during the tuning process. Solver type, activation function type and the maximum number of iterations are some hyperparameters used for the tuning process for the ANN. To achieve more robust and successful results, a suitable set of hyperparameter values should be found by applying a tuning process instead of using the default parameter values of the algorithms. The tuning process aims for high $R^2$ scores and robustness (FIT5). The tuning process is demonstrated in detail in the next section. The market forecast and the company sales forecast for the next three months are required to calculate the grey spots in the BCG Matrix (FIT6). The forecasted demands of products are found by using the Random Forest and Artificial Neural Network algorithms and then used as the market data forecast. The sales forecast of the company is simply assumed to be the same as the previous period to demonstrate the status of the portfolio without taking any action according to changing market conditions.
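To make the idea of hyperparameter tuning concrete, the sketch below runs a generic cross-validated grid search with scikit-learn's `GridSearchCV`. It is not the tuning procedure developed in this study (summarized in Figure 5); the synthetic data, the candidate values and the three-fold cross-validation are assumptions chosen to keep the example small.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic regression data (assumption).
rng = np.random.default_rng(1)
X = rng.uniform(size=(120, 6))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=120)

# Candidate values for three of the RF hyperparameters named in the text.
param_grid = {
    "max_features": [0.5, 1.0],
    "min_samples_leaf": [1, 2],
    "max_leaf_nodes": [100, 200],
}

search = GridSearchCV(
    RandomForestRegressor(n_estimators=50, random_state=0),
    param_grid,
    scoring="r2",  # select the combination with the highest mean R^2
    cv=3,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Every combination in the grid is scored on held-out folds, so the selected values are chosen for out-of-sample $R^2$ rather than for fit on the training rows.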

Figure 3. The general procedure of the Random Forest algorithm

Figure 4. A Multi-layer perceptron topology with three inputs, two hidden layers and a single output

The results of the case study are presented in the following section, including a detailed preliminary analysis of data preparation and the process of hyperparameter tuning.

5. Results

As explained in detail in Section 3, our hypothetical case study company is assumed to export three types of products from China to several countries. To demonstrate our methodology, only the export volumes to the United Kingdom are forecasted. The data set used in the forecasting model is combined from the different open data sources given in Table 1. The resulting data set covers April 2006 to March 2018 with 144 monthly data points.

Before applying the tuning process, three preliminary analyses are conducted to test the effects of the input month combinations, feature selection and dependent variable transformation decisions on the forecast quality. The first analysis decides which month or month combinations of the past trade data should be taken as input factors in the forecasting model. The past trade data of three months ago; three and four months ago; and three, four and five months ago are tested. Second, the percentage of selected features is tested with 50%, 75% and 100%, where 50% means that only half of the features are included in the model. The third analysis determines how the dependent variable is used. Three options are examined: (1) no transformation, (2) logarithm transformation and (3) square root transformation. According to the results of the preliminary analyses (a total of 27 test combinations and ten random number seeds for each combination), the best combinations are found, as shown in Table 2.
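The size of this preliminary experiment follows directly from the three design choices: 3 × 3 × 3 = 27 combinations, each repeated with ten seeds. The sketch below enumerates them; the variable names and the pairing of each transformation with its inverse (needed to bring forecasts back to the original scale) are assumptions of this illustration.

```python
import math
from itertools import product

# The three design choices described in the text.
lag_sets = [(3,), (3, 4), (3, 4, 5)]   # months of past trade data used as inputs
feature_shares = [0.50, 0.75, 1.00]    # share of features kept in the model
transforms = {                         # (forward, inverse) pairs for the target
    "none": (lambda v: v, lambda v: v),
    "log": (math.log, math.exp),
    "sqrt": (math.sqrt, lambda v: v * v),
}
seeds = range(10)

combinations = list(product(lag_sets, feature_shares, transforms))
runs = [(lags, share, name, seed)
        for (lags, share, name) in combinations
        for seed in seeds]
print(len(combinations), len(runs))  # 27 270

# A forecast made on the log scale is mapped back with the inverse transform.
forward, inverse = transforms["log"]
restored = inverse(forward(12761.0))
```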

There are 28 features in both RF and ANN models. Since the features in our data set have different ranges, the data of each feature are normalized using min-max normalization. After normalization, data are split into training and test sets, where the training set is 80% of the entire data set with 115 observations, and the test set is the remaining 20% with 28 observations. During the split process, data stratification is applied to ensure that data of each month are included in both the training and test sets. Therefore, the accumulation of certain months in the test or training sets is prevented. Then, the hyperparameters are tuned for both algorithms according to the developed procedure summarized in Figure 5.
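Min-max normalization and a month-stratified split can be sketched as below. This is an illustrative implementation under stated assumptions: the helper names are hypothetical, the per-month rounding rule is a choice made for this example, and the resulting set sizes therefore need not match the ones reported above.

```python
import random
from typing import Dict, List, Tuple


def min_max_normalize(values: List[float]) -> List[float]:
    """Rescale values to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def stratified_split(months: List[int], test_share: float,
                     seed: int) -> Tuple[List[int], List[int]]:
    """Split row indices so that every calendar month appears in both sets."""
    rng = random.Random(seed)
    by_month: Dict[int, List[int]] = {}
    for idx, month in enumerate(months):
        by_month.setdefault(month, []).append(idx)
    train, test = [], []
    for month in sorted(by_month):
        rows = by_month[month][:]
        rng.shuffle(rows)
        n_test = max(1, round(test_share * len(rows)))  # assumed rounding rule
        test.extend(rows[:n_test])
        train.extend(rows[n_test:])
    return sorted(train), sorted(test)


# Calendar month of each of the 144 rows, April 2006 to March 2018.
months = [(3 + t) % 12 + 1 for t in range(144)]
train_idx, test_idx = stratified_split(months, test_share=0.2, seed=0)
print(min_max_normalize([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```

Because the test rows are drawn separately within each calendar month, no month can accumulate in only one of the two sets.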

Both ML methods were implemented by using the open source “scikit-learn” library of Python on a Windows PC. The “MLPRegressor” and “RandomForestRegressor” classes of this library were used for the ANN and RF implementations, respectively.

Upon completing the tuning of the Random Forest algorithm for each product, the hyperparameter values in Table 3 are obtained. Similarly, the tuned hyperparameters of the Artificial Neural Network model are presented in Table 4. Note that the tuned hyperparameter values differ between products in both models, so applying the tuning process to each product and using the appropriate parameters is important for achieving good-quality results.

Table 2. The best input combination of each product according to the preliminary analyses

| Product HS code | Included past trade data (months) (−3; −3 and −4; −3, −4 and −5) | Transformation of the dependent variable (no transformation, log, sqrt) | Percentage of selected features (50%, 75%, 100%) |
|---|---|---|---|
| 84.18.10 | −3 (a) | log | 50 |
| 84.18.40 | −3, −4 and −5 | log | 50 |
| 84.18.50 | −3 and −4 | log | 75 |

Note(s): (a) “−3” refers to the past trade volume of three months ago

To demonstrate the efficiency of the tuning process, Figure 6 compares the $R^2$ values of the Random Forest model between tuned hyperparameter values and default (not tuned) hyperparameter values for all products. As seen from the boxplots of ten different training and test sets with different random number seeds, the tuning process yields significantly higher $R^2$ values and more robust results. Note that Figure 6 shows only the results of the RF model; however, the results of the ANN model are similar.

Figure 7 compares the Random Forest and Artificial Neural Network models in terms of median $R^2$ values. Both the RF and ANN models are compared with ten random number seeds. To demonstrate the robustness of the tuning process, the results of the same training/test split used in tuning (with ten random seeds) and of ten different random training/test splits are compared for each model. As seen from the similar results of the same and different training/test data sets in Figure 7, the tuning yields robust results for both models and all product types.
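For reference, the comparison above rests on two simple quantities: the coefficient of determination of each run, $R^2 = 1 - SS_{res}/SS_{tot}$, and the median of the $R^2$ values over the seeds. The sketch below computes both; the actual and predicted values and the per-seed scores are hypothetical numbers chosen for illustration, not results of the study.

```python
from statistics import median
from typing import List


def r2_score(actual: List[float], predicted: List[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the actual values are constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical forecasts for one run.
r2 = r2_score([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(round(r2, 2))  # 0.98

# Hypothetical R^2 values of one model over five seeds.
scores_by_seed = [0.90, 0.92, 0.91, 0.89, 0.93]
print(median(scores_by_seed))                             # 0.91
print(round(max(scores_by_seed) - min(scores_by_seed), 2))  # 0.04, a simple spread measure
```

A high median with a small spread across seeds and splits is what the text refers to as a robust result.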

Figure 5. The developed tuning process for Random Forest and Artificial Neural Network algorithms

Table 3. Tuned values for hyperparameters of the Random Forest model for all products

| Product HS code | Maximum features | Minimum samples leaf | Maximum leaf nodes | Minimum weight fraction for leaf | Minimum impurity decrease | Number of estimators |
|---|---|---|---|---|---|---|
| 84.18.10 | Auto | 1 | 200 | 0.0001 | 0.00001 | 200 |
| 84.18.40 | Auto | 2 | 300 | 0.01 | 0.000001 | 100 |
| 84.18.50 | Auto | 2 | 100 | 0.001 | 0.000001 | 100 |

Table 4. Tuned values for hyperparameters of the Artificial Neural Network model for all products

| Product HS code | Solver type | Activation function | Alpha | Maximum number of iterations | Hidden layer size |
|---|---|---|---|---|---|
| 84.18.10 | lbfgs | logistic | 0.0000001 | 1,000 | (30, 100, 30) |
| 84.18.40 | lbfgs | relu | 0.0000001 | 100,000 | (30, 3,100, 30) |
| 84.18.50 | lbfgs | identity | 0.001 | 50,000 | (10, 10) |

Figure 6. Before tuning and after tuning $R^2$ results of the Random Forest model for all products: (a) 84.18.10, (b) 84.18.40, (c) 84.18.50 (boxplots of Not_Tuned versus Tuned)

According to Figure 7, the RF algorithm yields higher $R^2$ values than the ANN algorithm for all product types. Among the three product types, the highest $R^2$ value (0.923 with RF) is observed in Product 84.18.50, and the lowest $R^2$ value (0.298 with ANN) is observed in Product 84.18.40. However, for Product 84.18.40, the median $R^2$ value increases to 0.531 when RF is used. Similar to Product 84.18.40, the ANN model yields a smaller $R^2$ value (0.374) for 84.18.10. The highest median $R^2$ value of the ANN model is 0.802 for product 84.18.50, but the RF model achieves a median $R^2$ value of 0.921 for that product. Therefore, according to these results, the RF model achieves higher $R^2$ values and performs better than the ANN model.

After tuning the hyperparameters and selecting the models for all products, the next step is to apply these models to generate the BCG Matrix view. Our case study considers only three export products from China to the United Kingdom; therefore, our BCG Matrix has three components. To demonstrate our methodology, the snapshot of January 2018 is arbitrarily selected as the reference date. Colored spots in the BCG matrix show the current situation as of January 2018, and the grey spots represent the forecasted market situations of April 2018. Note that the forecast horizon is chosen as three months. The determination of the growth rate in the BCG matrix is an important issue because the increase (or decrease) from a specific month to the next one can be very large. Since the monthly fluctuations of the trade volume can be erratic, they can cause artificially high (or low) growth rate calculations and, therefore, a misleading BCG Matrix. To overcome this problem, the growth rate is calculated by using the averages of two consecutive quarters (periods of three months), as explained in detail in Table 5. In these calculations, the company sales to the UK are assumed to be $500,000, $200,000 and $300,000 for the products 84.18.10, 84.18.40 and 84.18.50, respectively. Similarly, the averages of the forecasts of the successive quarters are taken as the forecasted market sizes, as shown in Table 6. Note that Tables 5 and 6 show the calculations of the colored and grey spots, respectively.
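The quarterly-average calculation can be checked with a few lines of code. The sketch below uses the monthly export volumes and the assumed company sales of product 84.18.10 from Table 5; the function names are hypothetical, and the relative market share is taken as company sales divided by the latest quarterly average of the market, which is the reading consistent with the percentages reported in the tables.

```python
from typing import List


def quarterly_average(monthly: List[float]) -> float:
    """Average trade volume over a three-month period."""
    return sum(monthly) / len(monthly)


def growth_rate(previous_avg: float, current_avg: float) -> float:
    """Growth rate of trade flow between two consecutive quarterly averages."""
    return (current_avg - previous_avg) / previous_avg


def relative_market_share(company_sales: float, market_avg: float) -> float:
    """Company sales as a share of the average market volume."""
    return company_sales / market_avg


# Product 84.18.10, export volume from China to the UK ($1,000), Table 5.
previous_quarter = [17149, 16072, 17063]   # 2017-08 to 2017-10
current_quarter = [16218, 11329, 10737]    # 2017-11 to 2018-01
company_sales = 500                        # assumed company sales ($1,000)

prev_avg = quarterly_average(previous_quarter)
curr_avg = quarterly_average(current_quarter)
grtf = growth_rate(prev_avg, curr_avg)
rms = relative_market_share(company_sales, curr_avg)
print(f"{prev_avg:.0f} {curr_avg:.0f} {grtf:.1%} {rms:.1%}")  # 16761 12761 -23.9% 3.9%
```

Comparing quarterly averages instead of single months dampens the effect of one unusually high or low month, such as the drop from 16,218 to 11,329 between November and December 2017.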

Figure 7. The median $R^2$ values of the tuned RF and ANN models when using the same training set and different training sets

Table 5. The calculations of the colored spots (real data) of the BCG Matrix components as of January 2018

| Product code (k) | Year-month (t) | Company sales to the UK ($1,000) (Y) | Export volume from China to the UK ($1,000) (X) actual | Quarterly average of trade volume ($1,000) (X) actual | Growth rate of trade flow (GRTF) | Relative market share (RMS) |
|---|---|---|---|---|---|---|
| 84.18.10 | 2017-08 | 500 | 17,149 | 16,761 | $\frac{12{,}761 - 16{,}761}{16{,}761} = -23.9\%$ | $\frac{500}{12{,}761} = 3.9\%$ |
| | 2017-09 | 500 | 16,072 | | | |
| | 2017-10 | 500 | 17,063 | | | |
| | 2017-11 | 500 | 16,218 | 12,761 | | |
| | 2017-12 | 500 | 11,329 | | | |
| | 2018-01 | 500 | 10,737 | | | |
| 84.18.40 | 2017-08 | 200 | 2,152 | 2,139 | $\frac{1{,}730 - 2{,}139}{2{,}139} = -19.1\%$ | $\frac{200}{1{,}730} = 11.6\%$ |
| | 2017-09 | 200 | 2,187 | | | |
| | 2017-10 | 200 | 2,077 | | | |
| | 2017-11 | 200 | 1,895 | 1,730 | | |
| | 2017-12 | 200 | 1,813 | | | |
| | 2018-01 | 200 | 1,481 | | | |
| 84.18.50 | 2017-08 | 300 | 4,837 | 3,689 | $\frac{3{,}573 - 3{,}689}{3{,}689} = -3.1\%$ | $\frac{300}{3{,}573} = 8.4\%$ |
| | 2017-09 | 300 | 3,226 | | | |
| | 2017-10 | 300 | 3,003 | | | |
| | 2017-11 | 300 | 1,973 | 3,573 | | |
| | 2017-12 | 300 | 5,063 | | | |
| | 2018-01 | 300 | 3,683 | | | |

Table 6. The calculations of the grey spots (forecasted data) of the BCG Matrix components for April 2018

| Product code (k) | Year-month (t) | Company sales to the UK ($1,000) (Y) | Export volume from China to the UK ($1,000) (X) actual and (X′) forecasted | Quarterly average of trade volume ($1,000) (X) and (X′) | Growth rate of trade flow (GRTF) | Relative market share (RMS) |
|---|---|---|---|---|---|---|
| 84.18.10 | 2017-11 | 500 | 16,218 | 12,761 | $\frac{12{,}581 - 12{,}761}{12{,}761} = -1.4\%$ | $\frac{500}{12{,}581} = 4.0\%$ |
| | 2017-12 | 500 | 11,329 | | | |
| | 2018-01 | 500 | 10,737 | | | |
| | 2018-02 | 500 | 12,722 | 12,581 | | |
| | 2018-03 | 500 | 12,152 | | | |
| | 2018-04 | 500 | 12,869 | | | |
| 84.18.40 | 2017-11 | 200 | 1,895 | 1,730 | $\frac{1{,}642 - 1{,}730}{1{,}730} = -5.1\%$ | $\frac{200}{1{,}642} = 12.2\%$ |
| | 2017-12 | 200 | 1,813 | | | |
| | 2018-01 | 200 | 1,481 | | | |
| | 2018-02 | 200 | 1,653 | 1,642 | | |
| | 2018-03 | 200 | 1,566 | | | |
| | 2018-04 | 200 | 1,707 | | | |
| 84.18.50 | 2017-11 | 300 | 1,973 | 3,573 | $\frac{3{,}304 - 3{,}573}{3{,}573} = -7.5\%$ | $\frac{300}{3{,}304} = 9.1\%$ |
| | 2017-12 | 300 | 5,063 | | | |
| | 2018-01 | 300 | 3,683 | | | |
| | 2018-02 | 300 | 3,222 | 3,304 | | |
| | 2018-03 | 300 | 3,361 | | | |
| | 2018-04 | 300 | 3,329 | | | |

Using the values generated in Tables 5 and 6, the next step is to construct the BCG matrix, as presented in Figure 8. According to Figure 8, the products 84.18.40 and 84.18.50 are in the Cash Cow quadrant, whereas product 84.18.10 is on the border between Cash Cow and Dog. The forecasts of 84.18.40 and 84.18.50 remain in the Cash Cow quadrant. The main strategy recommendation for Cash Cows is to maintain a certain investment level to keep them in the Cash Cow quadrant so that they provide a certain level of cash flow for supporting the company and the other business units (Kotler and Armstrong, 2018). Product 84.18.10’s forecast stays on the borderline. The next moves of product 84.18.10 should be carefully analyzed. If it becomes a Cash Cow, then some investment should be made. If it turns into a Dog, a downsizing plan can be applied because this product has a large share in sales. For all three products, the market continues to get smaller (negative growth rate). For product 84.18.40, the figure indicates that the market shrinkage slows down from −19.1% to −5.1%. For product 84.18.10, a similar move is expected. On the other hand, market shrinkage slightly accelerates for product 84.18.50 (from −3.1% to −7.5%). Our case study is limited to three products that are mainly located in the Cash Cow quadrant. If any product appears in other quadrants, different strategies and decisions for strategic market analysis should be applied, as explained in detail in Section 4.1.
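The reading of the matrix described above can be sketched in code. The growth rates and relative market shares below are the rounded values from Tables 5 and 6. The quadrant rule follows the standard BCG definitions, but the two cut-off values (`GROWTH_CUT` and `SHARE_CUT`) are purely hypothetical assumptions of this example, since the thresholds used for Figure 8 are not stated in this section.

```python
# (growth rate of trade flow, relative market share) per product, Tables 5 and 6.
positions = {
    "84.18.10": {"current": (-0.239, 0.039), "forecast": (-0.014, 0.040)},
    "84.18.40": {"current": (-0.191, 0.116), "forecast": (-0.051, 0.122)},
    "84.18.50": {"current": (-0.031, 0.084), "forecast": (-0.075, 0.091)},
}

GROWTH_CUT = 0.0   # hypothetical boundary between low and high market growth
SHARE_CUT = 0.04   # hypothetical boundary between low and high market share


def bcg_quadrant(growth: float, share: float) -> str:
    """Standard BCG quadrants for the assumed cut-off values."""
    high_growth = growth >= GROWTH_CUT
    high_share = share >= SHARE_CUT
    if high_growth and high_share:
        return "Star"
    if high_growth:
        return "Question Mark"
    if high_share:
        return "Cash Cow"
    return "Dog"


def shrinkage_trend(current_growth: float, forecast_growth: float) -> str:
    """Compare two negative growth rates: is the market shrinking slower or faster?"""
    if current_growth >= 0 or forecast_growth >= 0:
        return "not shrinking in both periods"
    if forecast_growth > current_growth:
        return "slows"
    if forecast_growth < current_growth:
        return "accelerates"
    return "unchanged"


for code, pos in positions.items():
    trend = shrinkage_trend(pos["current"][0], pos["forecast"][0])
    print(code, bcg_quadrant(*pos["forecast"]), trend)
```

Under the assumed share cut-off, product 84.18.10 sits almost exactly at the threshold, so its label changes with very small movements in market share; this is only one possible way to express a borderline position and should not be read as the rule behind Figure 8.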

5.1 Discussion

Our model utilizes the well-known BCG matrix, which is one of the most widely used portfolio management techniques in practice and the literature (Morrison and Wensley, 1991; Nippa et al., 2011). The proposed methodology accurately forecasts trade volumes between countries and helps to identify future trends and market situations. Thus, the case study shows that the proposed model helps to conduct strategic market analysis effectively.

In our study, the main factors that affect trade volume are identified as both product-specific (trade volume) and global (economic, business, political and trade environment) factors. Also, in accordance with the results of the nonparametric forecasting literature (Alam, 2019; Behrens, 2019; Co and Boosarawongse, 2007; Devyatkin et al., 2018; Ertugrul and Tagluk, 2018; Gupta and Kashyap, 2015; Sokolov-Mladenović et al., 2016), both ANN and RF yielded accurate forecasts in our case study with three products. However, RF performed better than ANN in terms of forecast accuracy and robustness of the results.

Table 7 summarizes the results of the study and discusses the findings regarding our research questions.

Table 7. Summary of the results

| Research question | Result |
|---|---|
| How can we use open data sources in strategic market analysis for exporters/importers? | International trade data (export/import volumes) can be used as market information. In addition to this, company sales, market growth rate, market share and other product specific and global (business, economic and political environment) data can be used for trade forecasting (Table 1) |
| Which strategic management models and open data can be used together for strategic market analysis? | BCG matrix is the most suitable model among various strategic management models (BCG Matrix, Ansoff Matrix, Positioning Map, etc.) because it shows the market growth rate and market share of a company for a given product. For calculating these two components of the BCG matrix, trade volumes can be used |
| How can we create additional value by using Big Data Analytics in the selected strategic market analysis method? | The BCG matrix only analyzes the current market situation. BDA allows companies to include the forecasted positions of the components in the BCG matrix. By also including the forecasted trade values, the improved BCG matrix of a company depicts the current and future positions of the company’s products (Figure 8) |

6. Conclusion and research implications

Today’s increased global competition forces companies to make better predictions and strategic decisions considering their business environments. The Boston Consulting Group Matrix is one of the most well-known management tools that revolutionized strategic management. However, it has some issues in practice, such as the difficulty of calculating market share and market growth rate. Also, it shows only the current business environment and does not give any insight into the future. By accurately predicting the future business environment, some future insight can be added to the BCG Matrix. Thus, companies can identify new trade and business opportunities from the forecasts of the trade volumes between countries. However, accurate forecasting has become significantly harder due to the increased complexity of globalization and competition between countries and supply chains. Using BDA, accurate forecasts can be achieved and these forecasts can be used in strategic market analysis. In this study, we propose a holistic methodology for strategic decision making using BDA.

Figure 8. The generated BCG matrix for products 84.18.10, 84.18.40 and 84.18.50

6.1 Theoretical contributions

The main theoretical contribution is the development of a holistic methodology for strategic market analysis using BDA. Our methodology uses machine learning methods and various open international data to forecast international trade volumes between countries. By using these forecasts as the future market data for exporters (or importers), some future insight information is added to the BCG Matrix for strategic market analysis. To the best of our knowledge, this is the first study to use Big Data Analytics and machine learning methods for strategic market analysis.

As another contribution, we develop a forecast model that uses two machine learning algorithms, RF and ANN. Also, the proposed nonparametric forecasting model contributes to the literature by using a greater variety (and amount) of data sources and machine learning features than the existing studies. The proposed preliminary analysis and tuning process to improve forecast accuracy during the machine learning phase are further contributions for solving similar forecasting problems. In addition, we show that RF performs better than ANN in terms of trade volume forecasting.

6.2 Practical contributions

To demonstrate our proposed methodology, we presented a case study of a hypothetical Chinese company exporting refrigerators and freezers to the United Kingdom. At the first stage of our methodology, the BCG Matrix is applied with the international trade data as the current market situation. Then, we add the forecasted future values to the BCG Matrix by using our proposed Big Data Analytics method. To test the efficiency of the proposed methodology, three different sub-product groups within the main product group of refrigerators and freezers are selected in the case study. For forecasting the trade volumes, 28 different factors are considered, and the data (ranging from April 2006 to March 2018) of each factor are obtained from openly available data sources (OECD, International Trade Center, etc.). RF and ANN methods are used as the main forecasting algorithms.

Our case study shows that the potential users (exporters or importers) can easily adapt the proposed methodology to align their strategic market decisions according to the current and future market trends. The results show that the RF model yields more accurate forecasts than the ANN model. However, both RF and ANN models provide successful forecasts for trade volumes. Identifying unimportant features, transforming the dependent variable with logarithm and adding past trade volume information contribute significantly to the accurate forecasts. Instead of using all features obtained from the data sources, using 50% or 75% of the features improves the forecasting accuracy for all products in both RF and ANN models. The results also show that the tuning process helps to find better and more robust results. Therefore, feature selection and tuning procedures improve the forecast accuracy for all products.

As demonstrated in the case study, foreseeing market conditions gives important managerial insights for companies. Companies can make better strategic decisions according to future market trends and seize business opportunities. For example, a company can strategically increase (decrease) the investment level for a product if it sees a significant market growth (shrinkage). Companies can also plan different strategies (such as grow, retain, harvest, and exit) for their product portfolios within different market segments. More specifically, the company can act according to the growth (and shrinkage) of the market and its market share. For example, if a market shrinkage is predicted, the company may try to reduce production or decrease its inventory levels through sales promotions or price discounts. Then, the idle production capacity can be diverted to more promising products where market growth is expected. If the market shrinkage continues, the company may consider looking for new markets (countries in our case study) to sell its products or even abandoning the production of that product in the long run. On the other hand, if market growth is predicted for a product, the company can plan to increase production capacity and make new contracts with suppliers to satisfy raw material requirements. The company can also make new marketing campaigns and promotions to increase sales and benefit from economies of scale. If the market share is predicted to fall, then the company may try to increase sales by price promotions, diversifying sales channels, and increasing the quality of the product. The company may even start partnerships and strategic alliances with the local companies to introduce new distribution channels and use the market knowledge of the local partners for better market penetration. All these strategies may be used together as they help companies to gain a competitive advantage and survive in the international markets.

Depending on the purpose and the available data sets, the steps of the proposed framework can easily be applied in other real-life applications within the supply chain, such as demand management, supplier risk management, product quality management, and predictive maintenance.

6.3 Limitations and future research

For demonstrating the effectiveness of the proposed methodology, this study presents only one case of a hypothetical Chinese company. For future work, our methodology can be tested on other country pairs with the same product types. As another extension, the proposed methodology can be applied to different products and country pairs to identify the significant factors affecting the bilateral trade volumes between countries. This study is focused on foreseeing three months ahead; however, other forecast horizons can be investigated to test the proposed methodology. Lastly, other machine learning methods (e.g. Long Short Term Memory) can be used to compare with RF and ANN.

References

Akhtar, S. (2003), “Is there seasonality in Pakistan’s merchandise exports and imports? The univariate modelling approach”, Pakistan Development Review, Vol. 42 No. 1, pp. 59-75.

Akter, S., Bandara, R., Hani, U., Fosso Wamba, S., Foropon, C. and Papadopoulos, T. (2019), “Analytics-based decision-making for service systems: a qualitative study and agenda for future research”, International Journal of Information Management, Vol. 48, pp. 85-95.

Alam, T. (2019), “Forecasting exports and imports through artificial neural network and autoregressive integrated moving average”, Decision Science Letters, Vol. 8 No. 3, pp. 249-260.

Ayankoya, K., Calitz, A.P. and Greyling, J.H. (2016), “Real-time grain commodities price predictions in South Africa: a big data and neural networks approach”, Agrekon, Vol. 55 No. 4, pp. 483-508.

Barbosa, M.W., Vicente, A.D.L.C., Ladeira, M.B. and Oliveira, M.P.V.D. (2018a), “Managing supply chain resources with Big Data Analytics: a systematic review”, International Journal of Logistics Research and Applications, Vol. 21 No. 3, pp. 177-200.

Barbosa, M.W., Vicente, A.D.L.C., Ladeira, M.B. and Oliveira, M.P.V.D. (2018b), “Managing supply chain resources with Big Data Analytics: a systematic review”, International Journal of Logistics Research and Applications, Vol. 21 No. 3, pp. 177-200.

Behrens, C. (2019), “A nonparametric evaluation of the optimality of German export and import growth forecasts under flexible loss”, Economies, Vol. 7 No. 3, pp. 93-116.

Bovi, M. and Cerqueti, R. (2016), “Forecasting macroeconomic fundamentals in economic crises”, Annals of Operations Research, Vol. 247 No. 2, pp. 451-469.

Breiman, L. (2001), “Random forests”, Machine Learning, Vol. 45 No. 1, pp. 5-32.

Cao, G. and Duan, Y. (2017), “How do top- and bottom-performing companies differ in using business analytics?”, Journal of Enterprise Information Management, Vol. 30 No. 6, pp. 874-892.

Chen, C.P. and Zhang, C.-Y. (2014), “Data-intensive applications, challenges, techniques and technologies: a survey on big data”, Information Sciences, Vol. 275, pp. 314-347.

Co, H.C. and Boosarawongse, R. (2007), “Forecasting Thailand’s rice export: statistical techniques vs. artificial neural networks”, Computers and Industrial Engineering, Vol. 53 No. 4, pp. 610-627.


Cui, L., Song, M. and Zhu, L. (2019), “Economic evaluation of the trilateral FTA among China, Japan,and South Korea with big data analytics”, Computers and Industrial Engineering, Vol. 128,pp. 1040-1051.

Dale, C. and Bailey, V.B. (1982), “A Box-Jenkins model for forecasting US merchandise exports”, Journal of International Business Studies, Vol. 13 No. 1, pp. 101-108.

Devyatkin, D., Suvorov, R., Tikhomirov, I. and Otmakhova, Y. (2018), “Neural networks for food export gain forecasting”, 2018 International Conference on Intelligent Systems (IS), Funchal - Madeira, Portugal, pp. 312-317.

Doganis, P., Alexandridis, A., Patrinos, P. and Sarimveis, H. (2006), “Time series sales forecasting for short shelf-life food products based on artificial neural networks and evolutionary computing”, Journal of Food Engineering, Vol. 75 No. 2, pp. 196-204.

Duan, Y., Edwards, J.S. and Dwivedi, Y.K. (2019), “Artificial intelligence for decision making in the era of Big Data - Evolution, challenges and research agenda”, International Journal of Information Management, Vol. 48, pp. 63-71.

Emang, D., Shitan, M., Abd Ghani, A.N. and Noor, K.M. (2010), “Forecasting with univariate time series models: a case of export demand for peninsular Malaysia’s moulding and chipboard”, Journal of Sustainable Development, Vol. 3 No. 3, pp. 157-161.

Ertugrul, O.F. and Tagluk, M.E. (2018), “Forecasting financial indicators by generalized behavioral learning method”, Soft Computing, Vol. 22 No. 24, pp. 8259-8272.

Grömping, U. (2009), “Variable importance assessment in regression: linear regression versus random forest”, The American Statistician, Vol. 63 No. 4, pp. 308-319.

Gupta, S. and Kashyap, S. (2015), “Forecasting inflation in G-7 countries: an application of artificial neural network”, Foresight, Vol. 17 No. 1, pp. 63-73.

Hassani, H. and Silva, E.S. (2015), “Forecasting with big data: a review”, Annals of Data Science, Vol. 2 No. 1, pp. 5-19.

Hazen, B.T., Skipper, J.B., Boone, C.A. and Hill, R.R. (2018), “Back in business: operations research in support of big data analytics for operations and supply chain management”, Annals of Operations Research, Vol. 270 Nos 1-2, pp. 201-211.

Hu, J. and Zhang, Y. (2018), “Measuring the interdisciplinarity of Big Data research: a longitudinal study”, Online Information Review, Vol. 42 No. 5, pp. 681-696.

Hu, H., Wen, Y., Chua, T. and Li, X. (2014), “Toward scalable systems for big data analytics: a technology tutorial”, IEEE Access, Vol. 2, pp. 652-687.

Kaastra, I. and Boyd, M. (1996), “Designing a neural network for forecasting financial and economic time series”, Neurocomputing, Vol. 10 No. 3, pp. 215-236.

Kargbo, J.M. (2007), “Forecasting agricultural exports and imports in South Africa”, Applied Economics, Vol. 39 No. 16, pp. 2069-2084.

Keck, A., Raubold, A. and Truppia, A. (2010), “Forecasting international trade: a time series approach”, OECD Journal: Journal of Business Cycle Measurement and Analysis, Vol. 2009 No. 2, pp. 157-176.

Khan, T. (2011), “Identifying an appropriate forecasting model for forecasting total import of Bangladesh”, Statistics in Transition New Series, Vol. 12 No. 1, pp. 179-192.

Kotler, P. and Armstrong, G. (2018), Principles of Marketing, Global Edition, 17th ed., Pearson, London.

Kuo, R.J. and Li, P.S. (2016), “Taiwanese export trade forecasting using firefly algorithm based K-means algorithm and SVR with wavelet transform”, Computers and Industrial Engineering, Vol. 99, pp. 153-161.

Laptev, N., Yosinski, J., Li, E.L. and Smyl, S. (2017), Time-series Extreme Event Forecasting with Neural Networks at Uber, International Conference on Machine Learning, Sydney, pp. 1-5.

Mishra, S., Mishra, D. and Santra, G.H. (2016), “Applications of machine learning techniques in agricultural crop production: a review paper”, Indian Journal of Science and Technology, Vol. 9 No. 38, pp. 1-14.

Mishra, D., Gunasekaran, A., Papadopoulos, T. and Childe, S.J. (2018), “Big Data and supply chain management: a review and bibliometric analysis”, Annals of Operations Research, Vol. 270 Nos 1-2, pp. 313-336.

Morrison, A. and Wensley, R. (1991), “Boxing up or boxed in?: a short history of the Boston Consulting Group share/growth matrix”, Journal of Marketing Management, Vol. 7 No. 2, pp. 105-129.

Nguyen, T., Zhou, L., Spiegler, V., Ieromonachou, P. and Lin, Y. (2018), “Big data analytics in supply chain management: a state-of-the-art literature review”, Computers and Operations Research, Vol. 98, pp. 254-264.

Nippa, M., Pidun, U. and Rubner, H. (2011), “Corporate portfolio management: appraising four decades of academic research”, Academy of Management Perspectives, Vol. 25 No. 4, pp. 50-66.

Nummelin, T. and Hänninen, R. (2016), “Model for international trade of sawnwood using machine learning models”, Natural Resources and Bioeconomy Studies, Vol. 74, pp. 1-35, available at: http://jukuri.luke.fi/handle/10024/537749.

Pakravan, M.R., Kelashemi, M.K. and Alipour, H.R. (2011), “Forecasting Iran’s rice imports trend during 2009-2013”, International Journal of Agricultural Management and Development, Vol. 1, pp. 39-44.

Palmer, A., Montaño, J.J. and Sesé, A. (2006), “Designing an artificial neural network for forecasting tourism time series”, Tourism Management, Vol. 27 No. 5, pp. 781-790.

Pannakkong, W., Huynh, V.-N. and Sriboonchitta, S. (2016), “ARIMA versus artificial neural network for Thailand’s cassava starch export forecasting”, Causal Inference in Econometrics, Vol. 622, Springer, Cham, pp. 255-277.

Sagaert, Y.R., Aghezzaf, E.-H., Kourentzes, N. and Desmet, B. (2018), “Temporal big data for tactical sales forecasting in the tire industry”, Interfaces, Vol. 48 No. 2, pp. 121-129.

Sagaert, Y.R., Kourentzes, N., De Vuyst, S., Aghezzaf, E.-H. and Desmet, B. (2019), “Incorporating macroeconomic leading indicators in tactical capacity planning”, International Journal of Production Economics, Vol. 209, pp. 12-19.

Saggi, M.K. and Jain, S. (2018), “A survey towards an integration of big data analytics to big insights for value-creation”, Information Processing and Management, Vol. 54 No. 5, pp. 758-790.

Sahu, P.K. and Mishra, P. (2013), “Modelling and forecasting production behaviour and import-export of total spices in two most populous countries of the World”, Journal of Agricultural Research, Vol. 51 No. 1, pp. 81-97.

Sanders, N.R. (2014), Big Data Driven Supply Chain Management: A Framework for Implementing Analytics and Turning Information into Intelligence, Pearson Education, New Jersey.

Senhadji, A.S. and Montenegro, C.E. (1999), “Time series analysis of export demand equations: a cross-country analysis”, IMF Staff Papers, Vol. 46 No. 3, pp. 259-273.

Shibasaki, R. and Watanabe, T. (2012), “Future forecast of trade amount and international cargo flow in the APEC Region: an application of Trade-Logistics Forecasting Model”, Asian Transport Studies, Vol. 2 No. 2, pp. 194-208.

Shirazi, F. and Mohammadi, M. (2019), “A big data analytics model for customer churn prediction in the retiree segment”, International Journal of Information Management, Vol. 48, pp. 238-253.

Silva, E.S. and Hassani, H. (2015), “On the use of singular spectrum analysis for forecasting US trade before, during and after the 2008 recession”, International Economics, Vol. 141, pp. 34-49.

Sivarajah, U., Kamal, M.M., Irani, Z. and Weerakkody, V. (2017), “Critical analysis of Big Data challenges and analytical methods”, Journal of Business Research, Vol. 70, pp. 263-286.

Sokolov-Mladenović, S., Milovančević, M., Mladenović, I. and Alizamir, M. (2016), “Economic growth forecasting by artificial neural network with extreme learning machine based on trade, import and export parameters”, Computers in Human Behavior, Vol. 65, pp. 43-45.

Stahlbock, R. and Voß, S. (2010), “Improving empty container logistics – can it avoid a collapse in container transportation?”, in Kroon, L., Li, T. and Zuidwijk, R. (Eds), Liber Amicorum In Memoriam Jo Van Nunen, Rotterdam School of Management, Erasmus University, Rotterdam, pp. 217-224.

Surbakti, F.P.S., Wang, W., Indulska, M. and Sadiq, S. (2019), “Factors influencing effective use of big data: a research framework”, Information and Management, Vol. 57 No. 1, p. 1031463.

Tiwari, S., Wee, H.M. and Daryanto, Y. (2018), “Big data analytics in supply chain management between 2010 and 2016: insights to industries”, Computers and Industrial Engineering, Vol. 115, pp. 319-330.

Tkáč, M. and Verner, R. (2016), “Artificial neural networks in business: two decades of research”, Applied Soft Computing Journal, Vol. 38, pp. 788-804.

UN Trade Statistics (2017), “Harmonized commodity description and coding system (HS)”, available at: https://unstats.un.org/unsd/tradekb/Knowledgebase/50018/Harmonized-Commodity-Description-and-Coding-Systems-HS (accessed 28 July 2019).

Veenstra, A.W. and Haralambides, H.E. (2001), “Multivariate autoregressive models for forecasting seaborne trade flows”, Transportation Research Part E: Logistics and Transportation Review, Vol. 37 No. 4, pp. 311-319.

Verma, S. and Bhattacharyya, S.S. (2017), “Perceived strategic value-based adoption of Big Data Analytics in emerging economy”, Journal of Enterprise Information Management, Vol. 30 No. 3, pp. 354-382.

Wang, S., Li, J. and Zhao, D. (2017), “Understanding the intention to use medical big data processing technique from the perspective of medical data analyst”, Information Discovery and Delivery, Vol. 45 No. 4, pp. 194-201.

Wirth, R. and Hipp, J. (2000), “CRISP-DM: towards a standard process model for data mining”, Fourth International Conference on the Practical Application of Knowledge Discovery and Data Mining, Manchester, UK, pp. 29-39.

Yang, R., Yu, L., Zhao, Y., Yu, H., Xu, G., Wu, Y. and Liu, Z. (2020), “Big data analytics for financial market volatility forecast based on support vector machine”, International Journal of Information Management, Vol. 50, pp. 452-462.

Yaqoob, I., Hashem, I.A.T., Gani, A., Mokhtar, S., Ahmed, E., Anuar, N.B. and Vasilakos, A.V. (2016), “Big data: from beginning to future”, International Journal of Information Management, Vol. 36 No. 6, pp. 1231-1247.

Zhang, G., Patuwo, B.E. and Hu, M.Y. (1998), “Forecasting with artificial neural networks: the state of the art”, International Journal of Forecasting, Vol. 14 No. 1, pp. 35-62.

Zhou, L., Pan, S., Wang, J. and Vasilakos, A.V. (2017), “Machine learning on big data: opportunities and challenges”, Neurocomputing, Vol. 237, pp. 350-361.

About the authors
Murat Özemre graduated in 1996 from the Department of Aeronautical Engineering at Middle East Technical University. After graduation, he started to work at ROKETSAN in the Simulation and Control Systems group and participated in NATO projects on missile simulations. He received his M.S. in 2000 and his MBA in 2001. Then, he started to work at STM Inc. (Defense Technologies and Engineering). He is currently working as Software Development Head of BIMAR IT Services. His projects are mainly in the Container Liner, Agency, Depot Management and Terminal Management business domains. Since 2013, he has been pursuing his Ph.D. studies at Yasar University.

Ozgur Kabadurmus received his Ph.D. and M.S. in Industrial and Systems Engineering from Auburn University, USA, in 2013 and 2010, respectively. He started his academic career as a Teaching and Research Assistant at Istanbul Technical University and Auburn University. Then, he continued his academic career as Assistant Professor at Yasar University. He is currently working as a full-time academician at the Faculty of Business, Department of International Logistics Management at Yasar University. His main research areas are applied operations research/metaheuristic optimization, the analysis and design of supply chain systems, and big data analytics. Ozgur Kabadurmus is the corresponding author and can be contacted at: [email protected]
