Research - Xiao, Haotian
Publication
N. Xu, H. Xiao, Y. Zhu, X. Chen, Y. Li, X. Hu*, "A Novel End-to-end Framework for A-share Stock Market Portfolio Optimization Considering Risk Measure and Feature Exposure", International Conference on Big Data Technologies, Sept. 2024. [pdf][code]
- The QBOAP framework employs the Quality-Growth-Momentum-Sentiment (QGMS) feature set to capture a broad spectrum of market information. These features are designed to be representative and explanatory, ensuring comprehensive coverage of market dynamics.
- The framework integrates a novel loss function that accounts for return curve stability and feature exposure, and uses Bayesian optimization to search for the asset weights. It also incorporates a weight rank-based trading strategy to reduce transaction costs and improve trading value, achieving a CAGR of 21.6% and a Sharpe ratio of 1.46.
Figure 1: Performance Metric Comparison
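For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how a CAGR and an annualized Sharpe ratio are commonly computed from a series of simple daily returns. It is not the evaluation code of the paper: the function names, the 252-trading-day year, and the zero risk-free rate are assumptions made for illustration.

```python
import math
from statistics import mean, stdev

TRADING_DAYS = 252  # assumption: trading days per year


def cagr(daily_returns, periods_per_year=TRADING_DAYS):
    """Compound annual growth rate from simple periodic returns."""
    growth = math.prod(1.0 + r for r in daily_returns)
    years = len(daily_returns) / periods_per_year
    return growth ** (1.0 / years) - 1.0


def sharpe_ratio(daily_returns, risk_free_rate=0.0, periods_per_year=TRADING_DAYS):
    """Annualized Sharpe ratio; risk_free_rate is an annual rate (assumed 0 by default)."""
    # Convert the annual risk-free rate to a per-period rate before subtracting.
    excess = [r - risk_free_rate / periods_per_year for r in daily_returns]
    # Sample standard deviation, scaled to annual terms by sqrt(periods per year).
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)
```

Both functions take a plain list of returns, so they can be applied to any backtest output that has been converted to periodic simple returns.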
Graduation Thesis
H. Xiao, Q. Lin*, "Factor Zoo: Tame the Level2 Indicator in Cross-Section", Hongyi Honor College, Wuhan University, Graduation Thesis, May 2024.
- StockMaster Framework Introduction: Presents StockMaster, a novel framework for stock investment opportunity mining that integrates HTFRESH, SMB feature engineering, and TSNET to extract multi-force characteristics from level2 data, enhancing investment decision-making in quantitative investing.
- High-Frequency to Low-Frequency Feature Extraction: Introduces a framework to effectively mine high-frequency stock information, reducing noise and ineffective factors, thereby improving model prediction accuracy and offering more possibilities for investment decisions.
- TSNET Model and Backtesting Results: Features the TSNET model, which uses multi-adaptive graph convolution and multi-head self-attention to predict stock market trends, achieving an annualized return of 50.29% and a Sharpe ratio of 3.223, demonstrating the framework's effectiveness in improving investment accuracy and efficiency.
Figure 1: StockMaster Workflow
Under Review
R. Yu, H. Xiao, G. Zhang*, "Trading Value of Volatility Index: Empirical Evidence from Chinese Option Market". Under peer review at Pacific-Basin Finance Journal. [code]
- This paper constructs a generalized volatility index (GVIX) for the Chinese market using a new method that does not rely on the geometric Brownian motion assumption underlying the traditional VIX index. It examines the properties of GVIX in the Chinese market with reference to the existing literature, then compares GVIX with other volatility indices by evaluating the performance of option trading strategies that use each index as the implied volatility input. GVIX-based strategies achieve the highest annualized return, 47.62%, and a competitive Sharpe ratio of 1.94, suggesting that GVIX offers effective volatility information for option pricing.
Figure 1: Trading Strategy Comparison
S. Shang, H. Xiao, B. Shen*, "Nash Equilibrium of First Price Auction with Different Participation Cost". Under peer review at Games and Economic Behavior. [code]
Proposition 1: (Existence and Uniqueness Theorem)
For the independent private values economic environment with two bidders who have different participation costs \(c_2 > c_1\), we have the following conclusions:
- (1) There always exists a monotonic equilibrium.
- (2) Suppose \(F(\cdot)\) is concave. Then the equilibrium is unique and monotonic.
- (3) Suppose \(F(\cdot)\) is strictly convex. Then:
- (3i) The monotonic equilibrium is unique when the reverse hazard rate of \(F(\cdot)\), that is, \(\frac{f(\cdot)}{F(\cdot)}\), is nonincreasing.
- (3ii) The nonmonotonic equilibrium is unique when \(c_1 = c_m\).
- (3iii) There is no nonmonotonic equilibrium when \(c_1 < c_m\).
- (3iv) There are at least two nonmonotonic equilibria when \(c_m < c_1 < c_2\).
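As an illustration of the hypotheses in case (3), and not an example taken from the paper, consider the assumed value distribution \(F(v) = v^2\) on \([0, 1]\):

\[
f(v) = F'(v) = 2v, \qquad F''(v) = 2 > 0, \qquad \frac{f(v)}{F(v)} = \frac{2}{v}, \qquad \frac{d}{dv}\left(\frac{f(v)}{F(v)}\right) = -\frac{2}{v^2} < 0 \quad \text{for } v \in (0, 1].
\]

Under this assumption \(F(\cdot)\) is strictly convex and its reverse hazard rate is decreasing, so the condition stated in (3i) is satisfied.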
Proposition 2: (Limit Theorem)
For the independent private values economic environment with two bidders having participation costs \(c_2 > c_1\), we have the following conclusions:
- (1) Suppose \(F(\cdot)\) is concave. The unique monotonic equilibrium (there is no nonmonotonic equilibrium in this case) converges to the unique symmetric equilibrium as \(c_2 - c_1 \rightarrow 0\).
- (2) Suppose \(F(\cdot)\) is strictly convex with nonincreasing reverse hazard rate. The unique monotonic equilibrium converges to an asymmetric equilibrium as \(c_2 - c_1 \rightarrow 0\).
- (3) Suppose \(F(\cdot)\) is strictly convex. When \(c_2 - c_1 \rightarrow 0\), there are two nonmonotonic equilibria, of which one converges to the unique symmetric equilibrium and the other converges to an asymmetric equilibrium.
Proposition 3: (Comparative Static Theorem)
For the independent private values economic environment with two bidders, suppose the values of bidders are drawn from a distribution function \(F(\cdot)\) and the participation costs \(c_1\) and \(c_2\) are common knowledge. Then for the monotonic equilibrium:
- (1) An increase in participation cost \(c_i\) increases bidder \(i\)'s cutoff \(v_i^{*}\) but decreases the opponent's cutoff \(v_j^{*}\) for \(j \neq i\).
- (2) In particular, when \(F(\cdot)\) is concave, which gives a unique and monotonic equilibrium, an increase in participation cost \(c_i\) increases bidder \(i\)'s cutoff \(v_i^{*}\) but decreases the opponent's cutoff \(v_j^{*}\) for \(j \neq i\).
Y. Zhu, H. Xiao, Y. Li, C. Yu*, X. Wang, and W. Cui, "Integration of LoRAS-ENN data augmentation and interpretable stacked learning in credit fraud detection". Under peer review at Big Data Research.
- Enhanced Data Reconstruction with LoRAS-ENN: The proposed LoRAS-ENN strategy addresses the data imbalance problem inherent in credit card fraud detection by combining the Localized Randomized Affine Shadowsampling (LoRAS) algorithm with the Edited Nearest Neighbours (ENN) technique. LoRAS enhances the dataset by generating synthetic fraud cases, while ENN removes redundant or ambiguous samples, improving the overall quality of the reconstructed data. After applying LoRAS-ENN to the dataset, the F1 score rises from 0.79 to 0.84.
- Superior Model Performance with Stacking Ensemble: The ensemble learning model built using a stacking strategy demonstrates significant improvements in fraud detection accuracy. When compared to twelve other models, the stacking model achieves an Area Under the Curve (AUC) greater than 0.987, which highlights its superior ability to differentiate between fraudulent and non-fraudulent transactions. The stacking approach enhances the model's generalization and robustness, making it a powerful tool for real-world fraud detection tasks.
- Interpretability with SHAP: The incorporation of SHapley Additive exPlanations (SHAP) improves the interpretability of the ensemble model by providing insights into the relative importance of different features in predicting fraud. SHAP analysis reveals that features V17, V4, and V14 contribute most to the model's predictions, allowing stakeholders to better understand which factors influence fraud detection and enabling more transparent decision-making in financial fraud prevention.
Figure 1: Main Workflow
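The cleaning step described for ENN can be sketched in a few lines. This is a minimal pure-Python illustration of one common variant of Edited Nearest Neighbours (majority vote over the k nearest neighbours, applied to every sample), not the authors' implementation; the function name, the toy data, and the choice k = 3 are assumptions, and the LoRAS oversampling step is omitted.

```python
import math
from collections import Counter


def edited_nearest_neighbours(points, labels, k=3):
    """Keep samples whose label matches the majority label of their k nearest neighbours."""
    kept_points, kept_labels = [], []
    for i, (p, label) in enumerate(zip(points, labels)):
        # Euclidean distance from sample i to every other sample, nearest first.
        neighbours = sorted(
            (math.dist(p, q), labels[j])
            for j, q in enumerate(points)
            if j != i
        )[:k]
        majority_label = Counter(lab for _, lab in neighbours).most_common(1)[0][0]
        # A sample that disagrees with its neighbourhood is treated as ambiguous and dropped.
        if majority_label == label:
            kept_points.append(p)
            kept_labels.append(label)
    return kept_points, kept_labels


# Hypothetical toy data: the last point is labelled 1 but sits inside the class-0 cluster.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (0.5, 0.5)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
clean_points, clean_labels = edited_nearest_neighbours(points, labels)
```

In this toy case only the mislabelled-looking point at (0.5, 0.5) is removed. In a LoRAS-ENN style pipeline, a filter of this kind would run after synthetic minority samples have been generated.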
Working Papers
- Financial Volatility Forecasting via Deep Learning
- Constructed LAHGNet, an improved transformer that adopts an Adaptive Spectral Block for frequency-domain representation and a heterogeneous graph attention layer to aggregate information from different neighbors.
- Applied LAHGNet to volatility prediction for major Chinese stock indices and compared its performance with MSGNet, TimesNet, PatchTST, and other cutting-edge deep learning models.
- Financial Volatility Forecasting via FinBERT
- Adopted Wind Information text and the AdapterBert strategy to fine-tune FinBERT.
- Applied the fine-tuned FinBERT to news sentiment analysis to build stock index sentiment indices as features for volatility prediction.
- Constructed a corresponding option pricing strategy.
- Do City Brandings Promote Urban Development? A Quasi-natural Experimental Study Represented by the Hostings of the World Expo and the Asian Games. [manuscript][code]
- Used the Synthetic Control Method (SCM) to assess the economic effects of hosting global events on urban development.
- Hosting events increases economic activity, with a 2.75% rise in light intensity, driven by tourism, infrastructure, and local product sales.
- The Relationship Between Happy, Technology and Social Media. [manuscript][code]
- The article employs a combination of econometric models (logit and ordered logit regression, TSLS) and explainable machine learning (BO-XGBOOST-SHAP) to analyze the relationship between happiness, income, technology use, and social media engagement based on GSS data.
- Income has a diminishing impact on happiness beyond a certain threshold, and the use of technology and social media significantly influences happiness levels, especially at the lower and higher ends of the income spectrum.