Due to the high value and high failure rate of startups, predicting their success has become a critical challenge across interdisciplinary research. Existing approaches typically model success prediction from the perspective of a single decision-maker, overlooking the collective dynamics of investor groups that dominate real-world venture capital (VC) decisions. In this paper, we propose SimVC-CAS, a novel collective agent system that simulates VC decision-making as a multi-agent interaction process. By designing role-playing agents and a GNN-based supervised interaction module, we reformulate startup financing prediction as a group decision-making task, capturing both enterprise fundamentals and the behavioral dynamics of potential investor networks. Each agent embodies an investor with unique traits and preferences, enabling heterogeneous evaluation and realistic information exchange through a graph-structured co-investment network. Using real-world data from PitchBook and under strict data leakage controls, we show that SimVC-CAS significantly improves predictive accuracy (for example, an approximately 25% relative improvement in average precision@10) while providing interpretable, multi-perspective reasoning. SimVC-CAS also sheds light on other complex group decision scenarios.
Startups contributed approximately 60% of the global new jobs between 2010 and 2020 (Union 2019), yet their failure rate remains high, estimated at around 90% (Kerr, Nanda, and Rhodes-Kropf 2014). This significant contrast has made the prediction of early-stage startup success (e.g., financing outcomes and IPO potential) a focal point of interdisciplinary research. Among various success indicators, early-stage financing plays a critical role as the "lifeline" for startup survival and growth (Cassar 2004). Access to capital not only provides essential financial support but also shapes long-term strategic direction. Consequently, predicting whether a startup can secure financing has emerged as a core problem in startup success prediction (Zhang et al. 2021; Dellermann et al. 2021).
In venture capital (VC), investors typically make decisions based on comprehensive evaluations that consider 2022). Following this logic, prior automated methods have focused on modeling individual decision-making by extracting firm-level features using various technical approaches. Representative paths include: (i) traditional machine learning on engineered features (Krishna, Agrawal, and Choudhary 2016;Ünal and Ceasu 2019); (ii) graph neural network(GNN)-based modeling of investment networks (Zhang et al. 2021;Lyu et al. 2021); and (iii) pre-trained language models for textual analysis (Maarouf, Feuerriegel, and Pröllochs 2025). These methods all aim to simulate a single idealized investor.
Investor decisions directly determine startup financing outcomes and development trajectories: the absence of investment leads to survival crises, while investors’ differentiated resource endowments profoundly shape future development paths (Diamond and Rajan 2001; Miloud, Aspelund, and Cabrol 2012). However, in real-world VC scenarios, startups are evaluated by multiple potential investors, not in isolation but as part of a co-investment network. These potential investors exchange information and collaboratively shape each other’s decisions through historical ties and peer influence (Han and Yang 2013; Liu 2017; Goldstein, Xiong, and Yang 2025). Our data confirm this dynamic: the average graph distance among investors who eventually co-invest in the same startup is only 2.16, compared to an average distance of 4.23 across the entire VC network. Thus, the outcome of financing is often determined by the collective judgment of an investor group, and startups of similar quality can experience divergent fates depending on the composition of their investor collective. This “investor collective effect” is not captured by existing methods that rely on a single decision-making perspective. The multi-agent nature of real-world VC evaluations makes a multi-agent framework a natural fit for modeling such interactions.
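The graph-distance comparison above can be illustrated with a minimal sketch. It assumes an unweighted co-investment graph stored as an adjacency dictionary and averages shortest-path lengths over reachable pairs only; the toy graph and the names `bfs_distances` and `mean_pairwise_distance` are hypothetical and are not taken from the paper's code or data.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def mean_pairwise_distance(adj, nodes):
    """Average shortest-path length over all reachable pairs drawn from `nodes`."""
    nodes = list(nodes)
    total, count = 0, 0
    for i, u in enumerate(nodes):
        dist = bfs_distances(adj, u)
        for v in nodes[i + 1:]:
            if v in dist:  # assumption: unreachable pairs are skipped
                total += dist[v]
                count += 1
    return total / count if count else float("nan")


# Toy path graph a - b - c - d - e with hypothetical investor ids.
toy_adj = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
group_mean = mean_pairwise_distance(toy_adj, ["a", "b", "c"])  # co-investing group
network_mean = mean_pairwise_distance(toy_adj, toy_adj.keys())  # whole network
```

In this toy graph the co-investing group sits closer together than the network average, which is the same qualitative pattern as the 2.16 versus 4.23 figures reported above.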
However, existing multi-agent systems fall short when applied to VC for two key reasons: First, current frameworks often use agent specialization to simulate compartmentalized reasoning across information dimensions -assigning individual agents to analyze different dimensions of information (Yu et al. 2024;Zhang et al. 2024b;Wang, Ihlamur, and Alican 2025). However, this models the internal thought process of one investor, rather than the diverse perspectives and interactions among multiple independent agents. Second, existing agent interaction mechanisms rely on heuristics (Wang et al. 2025;Qian et al. 2025;Li, Gong, and Jiang 2025) that cannot capture the nuanced, asymmetric relationships observed in actual co-investment behavior -relationships shaped by relational proximity (e.g., graph topology), investor heterogeneity, and contextual startup features.
To address these challenges, we propose SimVC-CAS: a collective agent system for simulating real-world venture capital decision-making. The core idea is to reframe financing forecasting as the outcome of a simulated collective decision by a network of investors -each modeled as an autonomous agent. SimVC-CAS jointly models enterprise fundamentals and multi-investor decision dynamics. Each agent in SimVC-CAS represents a distinct investor. The system takes as input a target startup’s comprehensive profile and simulates (i) heterogeneous individual judgments, and (ii) collective decision updates shaped by a co-investment network. The final group-level verdict forms the basis for financing prediction. As shown in Figure 1, this framework enables joint modeling of both company-level attributes and investor-collective behaviors -offering a more realistic and interpretable approach to VC forecasting.
Specifically, SimVC-CAS consists of three modules:
1. Startup Panoramic Portrait: This module integrates diverse startup data (including basic information, founding team backgrounds, and financing history) to construct a comprehensive enterprise profile. This serves as the shared context for all investor agents.
2. Heterogeneous Investor Portraits: We construct the potential investor pool from the startup’s historical investors and their past co-investment partners. Empirically, 77.35% of investors in new financing rounds originate from this group in our dataset. Each potential investor agent is assigned a distinct profile based on basic information, experience, and historical investment behavior. Leveraging the role-playing capability of LLMs, agents interpret the startup information through personalized lenses, capturing diverse cognitive biases and decision patterns.
3. Collective Agent Interaction Modeling: Investor interaction patterns not only depend on their own attributes and network structure, but also dynamically change around the characteristics of the evaluated startups, forming startup-centric interaction modes (Elfring and Hulsink 2007; Lee, Kim, and Marcotte 2015). To capture such complex interactions, we design a graph attention network with virtual nodes (VGAT), where virtual nodes represent the target startup and serve as information hubs connected to all investor nodes, thereby modeling the investor interaction patterns formed around the currently evaluated startup. Investor nodes are connected through edges determined by co-investment history. Through VGAT, the system captures multi-dimensional interaction patterns, balancing investor heterogeneity, startup characteristics, and graph topology, thereby approximating realistic group deliberation.
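As a rough illustration of how the potential investor pool in the second module might be assembled, the sketch below takes the union of a startup's historical investors and their past co-investment partners, measures how many next-round investors fall inside that pool, and draws a reproducible sample of k candidates. The dictionaries `investors_of` and `partners_of`, the function names, and the toy ids are all hypothetical; the paper does not specify its data structures.

```python
import random


def build_candidate_pool(startup, investors_of, partners_of):
    """Union of the startup's historical investors and their past co-investment partners."""
    historical = set(investors_of.get(startup, ()))
    pool = set(historical)
    for investor in historical:
        pool |= set(partners_of.get(investor, ()))
    return pool


def pool_coverage(pool, next_round_investors):
    """Fraction of next-round investors that were already in the candidate pool."""
    nxt = set(next_round_investors)
    return len(nxt & pool) / len(nxt) if nxt else 0.0


def sample_candidates(pool, k, seed=0):
    """Draw up to k candidates; sorting first keeps the draw reproducible."""
    ordered = sorted(pool)
    return random.Random(seed).sample(ordered, min(k, len(ordered)))


# Toy data with hypothetical ids.
investors_of = {"s1": ["i1", "i2"]}
partners_of = {"i1": ["i3"], "i2": ["i3", "i4"]}

pool = build_candidate_pool("s1", investors_of, partners_of)
coverage = pool_coverage(pool, ["i3", "i9"])  # i3 is in the pool, i9 is not
sampled = sample_candidates(pool, k=3)
```

A coverage statistic of this kind, computed over a whole dataset, is one way the 77.35% figure could be obtained, although the exact procedure used by the authors is not stated.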
Our contributions are as follows:
• We introduce a novel simulation-based paradigm for startup financing prediction that integrates LLM role-playing with collective agent modeling. This transforms the prediction task into a collective decision-making simulation, offering a fundamentally different lens on venture capital forecasting.
• We present SimVC-CAS, which jointly models company fundamentals and the investor collective effect using company basic information, scalable investor resumes, and graph-based collective agent interaction. This framework significantly enhances prediction accuracy and provides interpretable decision-making insights in VC research.
• Experiments, conducted under strict anti-leakage protocols for LLMs, demonstrate that SimVC-CAS outperforms competitive baselines in prediction accuracy and can provide interpretable VC insights. The proposed method is generalizable and can be extended to other domains involving complex heterogeneous interaction structures.
Predicting early-stage startup success is a critical task. Early research identified key predictive factors (Roure and Keeley 1990; Song et al. 2008; Sevilla-Bernardo, Sanchez-Robles, and Herrador-Alcaide 2022), which informed manual feature engineering efforts. Initial methods used commercial datasets such as PitchBook and Crunchbase to design features based on domain knowledge and heuristics for traditional machine learning models (Krishna, Agrawal, and Choudhary 2016; Ünal and Ceasu 2019; Bargagli-Stoffi, Niederreiter, and Riccaboni 2021). As large-scale datasets became available, the field evolved toward three main approaches:
(1) graph neural networks (GNNs) (Zhang et al. 2021;Lyu et al. 2021), which mine relational structures for investment patterns;
(2) pre-trained language models (LMs) (Maarouf, Feuerriegel, and Pröllochs 2025), which extract semantic features from company descriptions and filings; and
(3) multi-agent systems (Wang, Ihlamur, and Alican 2025; Griffin et al. 2025), which assign different agents to analyze complementary aspects of a company’s objective information, aiming to address the interpretability and scalability limitations of prior approaches.
Despite these advances, existing models all focus on single-investor decision frameworks, failing to capture the multi-perspective evaluation and group dynamics typical of the real world. Even when employing multi-agent approaches, they merely use multiple agents to analyze different dimensional information (Wang, Ihlamur, and Alican 2025; Griffin et al. 2025), essentially remaining within a single decision-maker perspective. To bridge this gap, we propose SimVC-CAS, a novel financing prediction paradigm that jointly models startup fundamentals and the collective decision-making behavior of potential investors.
Large Language Models (LLMs) have shown significant potential in simulating human intelligence and are increasingly used as a foundational component in autonomous agent systems (Luo et al. 2025;He, Treude, and Lo 2025;Zhang et al. 2025). In decision simulation contexts, LLM agents have been deployed to model user search behavior and predict click-through rates (Zhang et al. 2024a), forecast legislative bill outcomes based on policymaker behavior (Li, Gong, and Jiang 2025), and replicate group voting patterns from demographic data (Argyle et al. 2023).
However, the application of LLMs to simulate investor decision-making in VC remains largely underexplored. Current approaches are insufficient for modeling the heterogeneous, relational dynamics inherent in VC investment networks. To address this gap, we propose a graph neural network-based multi-agent interaction framework for simulating VC investment decisions. Beyond VC, this framework offers a generalizable foundation for decision modeling in other complex, heterogeneous network environments.
This study focuses on predicting the success of early-stage startups. Following prior work (Zhang et al. 2021), we define early-stage startups as those that have completed their first round of formal financing (seed or angel round) but have not yet raised subsequent rounds (e.g., Series A). These companies enter the venture capital ecosystem for the first time through initial investors. While success is often measured by whether a startup secures Series A funding (Zhang et al. 2021; Dellermann et al. 2021), prior studies have used different observation windows, potentially introducing time biases. To alleviate this, we adopt a consistent one-year observation window, which aligns with stage-based evaluation practices and helps control for external environmental factors such as market cycles and macroeconomic shocks (Boocock and Woods 1997). Accordingly, the core task of this study is defined as predicting whether a startup secures subsequent funding within one year of its initial financing.
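The task definition can be made concrete with a small labeling sketch. It assumes that "within one year" means a later round dated strictly after the first round and no more than 365 days later; the 365-day window, the function name `secured_follow_on`, and the example dates are illustrative assumptions rather than details given in the paper.

```python
from datetime import date, timedelta


def secured_follow_on(first_round, later_rounds, window_days=365):
    """Return True if any later round falls within the observation window.

    Assumption: the window is (first_round, first_round + window_days].
    """
    deadline = first_round + timedelta(days=window_days)
    return any(first_round < d <= deadline for d in later_rounds)


first = date(2021, 10, 1)  # hypothetical first-round date
label_positive = secured_follow_on(first, [date(2022, 6, 1)])
label_late = secured_follow_on(first, [date(2023, 1, 1)])
label_none = secured_follow_on(first, [])
```

Using one fixed window for every startup is what removes the observation-window bias discussed above: a company is never labeled positive merely because it was observed for longer.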
This section details the core framework of our proposed method, shown in Figure 2. We begin by constructing a comprehensive profile for each target startup, integrating basic information, team composition, and historical financing records. This profile serves as the fundamental input for investment evaluation. Meanwhile, we generate heterogeneous profiles for potential investors, encompassing personal information, investment experience, and professional background. To simulate investor decision-making, we employ LLM-based agents to model the investment preferences of individual investors through role-playing. Building on this, we introduce a collective agent interaction system: for each startup, we construct a co-investment network linking its potential investors based on historical co-investment relationships. In addition, we design a graph attention network with virtual nodes (VGAT) to capture the startup-centered investor interaction patterns in this network, enabling us to approximate real-world collective decision-making dynamics. This collective modeling improves the accuracy of the predicted likelihood of startup success.
The foundational characteristics of a startup play a critical role in predicting its likelihood of success. To support downstream decision-making, we construct a panoramic profile that captures a holistic view of the target company. This profile serves as a comprehensive input to subsequent agents in the framework. Basic Information This includes the startup’s founding date, industry classification, basic description, geographic location, primary products, relevant keywords, and other core attributes. Team Composition We compile detailed background information on key team members, including gender, education history, prior employment records, personal investment experience, and professional roles held. Historical Financing Records This section summarizes the startup’s past funding rounds, including the amounts raised and background information on the participating investors. Affiliated Companies In VC networks, startups that share key individuals (e.g., co-investors or co-founders) with the target company often provide valuable context. We define such entities as affiliated companies, and incorporate their relevant information into the target’s panoramic profile, enabling a more complete and context-aware evaluation.
For each target startup, we construct a real-time, heterogeneous profile of its potential investors. The candidate investor set is constructed by the following rule: we randomly sample k individuals from the union of the startup’s historical investors and their past co-investment partners. Based on this candidate set, we generate detailed investor profiles and simulate their behavior through a role-playing agent architecture. Each agent represents a distinct investor persona, offering differentiated analysis and decision-making support. This profiling system is highly extensible and can integrate data from multiple
To better approximate real-world investor decision-making dynamics, we design a collective agent interaction framework supervised by a graph neural network. In this system, each investor agent first performs an individual evaluation of the target startup based on their profile and outputs an initial investment decision. Next, the historical co-investment network is used, along with VGAT, to determine the interaction structure among these investor agents. Finally, the agents interact according to the learned topology, updating their beliefs and assessments based on peer influence. This dynamic group reasoning process allows agents to revise their initial decisions, resulting in a final, aggregated investment decision that reflects both individual judgment and collective insight.
VGAT To model investor interaction patterns shaped jointly by the characteristics of the target startup, the attributes of individual investors, and the topology of the co-investment network, we design VGAT. In this architecture, the target company is introduced as a virtual node, which is connected to all investor nodes in the co-investment graph.
Formally, let $G = (V, E, \omega)$ be an undirected weighted graph with virtual nodes. The node set is defined as $V = V_R \cup \{d\}$, where $V_R = \{v_1, \ldots, v_n\}$ represents real investor nodes and $d \notin V_R$ denotes the virtual node (i.e., the target company). The edge set consists of two parts: $E_R \subseteq V_R \times V_R$ contains the edges between real nodes, and the virtual edge set $\{(d, v_i) \mid v_i \in V_R\}$ connects the virtual node to all real nodes. A weight function $\omega : E \to \mathbb{R}^{+}$ assigns positive weights to all edges, including configurable weights $c_i$ for virtual edges.
The VGAT model processes $G$ through three sequential components. First, a global graph attention (GAT) layer updates the embeddings for all nodes:
Second, a local GAT layer refines the embeddings only for real nodes:
Concurrently, the virtual node embedding is transformed via a multi-layer perceptron (MLP) layer:
For each real edge $e_{ij} = (v_i, v_j) \in E_R$, the final edge embedding is computed as:
where the base edge representation $m_{ij}$ is given by:
where $W_Q$, $W_K$, and $W_V$ are learnable parameters. This model is jointly optimized through both the cross-entropy and contrastive losses:
The cross-entropy loss is used for classification, comparing predicted and actual class labels. $y_{ij}^{c}$ is the true label indicating whether edge $(v_i, v_j)$ belongs to class $c$, and $w_c$ is the learnable weight vector for class $c$.
The contrastive loss enhances the distinctiveness of edge embeddings using positive and negative pairs. $|P|$ is the total number of positive sample pairs. The final loss function combines these:
where $\alpha$ is a hyperparameter.
Because direct data on investor interactions is private, VGAT learns latent investor interactions by using future co-investment relationships as the only viable supervision signal. Our experiments validate the efficacy of this learning signal. (Training details are provided in the Appendix.)
Investor Agent Decision-Making In this section, we detail the decision-making process of the investor agents. Given a set of $k$ investor agents $\{A_1, A_2, \ldots, A_k\}$, we first construct a heterogeneous profile representation $P_i$ for each investor $A_i$. Through role-playing mechanisms, conditioned on the startup profile $C$, each agent generates an initial independent decision $D_i^{(0)}$ using a frozen LLM:
To model interactions among investors, we construct a co-investment network based on historical deal records. This network is formalized as an undirected, weighted graph $G = (V, E, \omega)$ with node set $V = V_R \cup \{C\}$, where $V_R = \{A_1, \ldots, A_k\}$ represents the real investor nodes and $C$ denotes the virtual node corresponding to the target startup. The edge set $E$ consists of $E_R \subseteq V_R \times V_R$, encoding historical co-investment ties, and the virtual edges $\{(C, A_i) \mid A_i \in V_R\}$, connecting each investor to the virtual target company node. The weight function $\omega : E \to \mathbb{R}^{+}$ is defined as $\omega(A_i, A_j) = n_{ij}$, where $n_{ij}$ is the number of past co-investments between $A_i$ and $A_j$, and $\omega(C, A_i) = 1$ for all $A_i \in V_R$.
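The weighted graph just defined can be sketched in a few lines. This is a minimal illustration that assumes historical deals are available as collections of investor ids; the `VIRTUAL` label, the function name `build_coinvestment_graph`, and the toy deals are hypothetical. Real edges carry the co-investment count n_ij and every virtual edge has weight 1, as stated above.

```python
from collections import Counter
from itertools import combinations

VIRTUAL = "C"  # hypothetical label for the virtual target-startup node


def build_coinvestment_graph(candidates, deals):
    """Return {(u, v): weight} for real edges and virtual edges.

    candidates: iterable of investor ids (the k sampled agents).
    deals: iterable of investor-id collections, one per historical deal.
    """
    cand = set(candidates)
    counts = Counter()
    for deal in deals:
        # Only pairs of candidate investors contribute real edges.
        members = sorted(cand & set(deal))
        for a, b in combinations(members, 2):
            counts[(a, b)] += 1  # n_ij: number of past co-investments
    weights = dict(counts)
    for investor in sorted(cand):
        weights[(VIRTUAL, investor)] = 1  # virtual edge to every investor
    return weights


toy_deals = [["A1", "A2"], ["A1", "A2", "A3"], ["A2", "X"]]
graph = build_coinvestment_graph(["A1", "A2", "A3"], toy_deals)
```

Edges are stored once per unordered pair with sorted endpoints, which matches the undirected formulation; investor "X" is ignored because it is not among the candidates.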
We use the pre-trained Jena-ColBert 1 as the node encoder to embed each investor and the target company:
• Investor node embedding: $h_i = \mathrm{NodeEncoder}(P_i)$.
• Company node embedding: $h_C = \mathrm{NodeEncoder}(C)$.
These embeddings form the embedded graph $G_{\text{embed}}$.
We then feed the embedded graph into a trained VGAT to infer interaction edges:
These interaction edges indicate which investor agents influence each other in the second round of decision-making.
Investor agents revise their initial decisions by incorporating the decisions and profiles of their interaction neighbors. Specifically, each agent $A_i$ updates its initial decision $D_i^{(0)}$ to a final decision $D_i^{(1)}$, which is informed by its own independent decision $D_i^{(0)}$, the initial judgment $D_j^{(0)}$ of each interacting investor $A_j$, and that investor's profile $P_j$. For startup success prediction, we define the overall success probability $P_{\text{success}}$ as the proportion of investor agents who ultimately decide to invest:
$$P_{\text{success}} = \frac{1}{k} \sum_{i=1}^{k} \delta_i,$$
where $\delta_i$ is an indicator function:
$$\delta_i = \begin{cases} 1, & \text{if } D_i^{(1)} \text{ is a decision to invest}, \\ 0, & \text{otherwise}. \end{cases}$$
This final score reflects collective confidence across the investor agents, capturing both individual assessments and peer influence through the co-investment network.
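The two-round process and the final aggregation can be sketched as follows. The `revise` callable stands in for the second LLM call and is purely hypothetical: here it is a toy majority rule over an agent's own vote and its neighbors' initial votes, and investor profiles are omitted for brevity. Only the aggregation functions correspond directly to the definitions in the text.

```python
def run_interaction_round(initial, neighbors, revise):
    """Each agent revises its decision using its neighbors' initial decisions."""
    final = {}
    for agent, own in initial.items():
        peer_views = {n: initial[n] for n in neighbors.get(agent, ())}
        final[agent] = revise(agent, own, peer_views)
    return final


def toy_revise(agent, own, peer_views):
    """Hypothetical stand-in for the LLM: strict majority of own and peer votes."""
    if not peer_views:
        return own
    votes = [own] + list(peer_views.values())
    return sum(votes) * 2 > len(votes)


def success_probability(final):
    """Proportion of agents whose final decision is to invest."""
    return sum(1 for decision in final.values() if decision) / len(final)


def binary_verdict(final):
    """True if any agent invests (the judgment rule used for classification)."""
    return any(final.values())


initial = {"A1": True, "A2": True, "A3": False, "A4": False}
neighbors = {"A1": ["A2"], "A2": ["A1", "A3"], "A3": ["A2"], "A4": []}

final = run_interaction_round(initial, neighbors, toy_revise)
p_success = success_probability(final)
verdict = binary_verdict(final)
```

Keeping the proportion and the binary verdict as separate functions reflects their different uses: the proportion gives a ranking score across startups, while the "any agent invests" rule yields the classification label.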
This section presents the main results, with consistency analysis and interpretability details provided in the Appendix.
We collected a global venture capital dataset from PitchBook, covering investment activities from 2005 to November 2023. The dataset includes investment details (such as invested companies, investor information, funding amounts, and financing rounds) as well as personal data (e.g., demographics of entrepreneurs and investors, including background, location, education) and company-level information (e.g., startup team composition, industry classifications, product keywords, company descriptions, and locations). In total, the dataset comprises 263,729 startups and 1,014,157 individuals.
For task evaluation, we select 2,507 startups that received their first round of financing between September 2021 and November 2022. We track whether each company secured subsequent funding within the following year. Among them, 533 startups received follow-on investment (positive samples), while 1,974 did not (negative samples). For each test case, we used all historical data available up to the time of its first round of financing to construct a stratified, time-aware company profile and a heterogeneous investor profile capturing relevant prior investment activity.
In binary classification tasks, standard metrics such as precision, recall, F1 score, and accuracy are commonly used. However, to better align with practical investor needs when choosing promising startups, we adopt Precision at K (P@K). P@K measures the proportion of successful startups among the top K recommendations, ranked by predicted confidence. This metric directly reflects the model’s utility in prioritizing high-potential investments and has been widely used in VC research (Sharchilev et al. 2018;Zhang et al. 2021;Lyu et al. 2021). To capture temporal stability and provide robust evaluation, we compute Average Precision at K (AP@K) by averaging P@K across monthly prediction windows. Higher AP@K values indicate stronger performance in selecting successful startups over time, making the model more actionable for real-world investment decisions.
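A minimal sketch of the two metrics follows. It assumes predictions are grouped by month as lists of (score, label) pairs with label 1 for a successful startup, and that P@K divides by K; the function names and toy scores are hypothetical.

```python
def precision_at_k(scored, k):
    """Share of successful startups among the k highest-scored predictions.

    scored: list of (score, label) pairs, label is 1 for success and 0 otherwise.
    """
    top = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    return sum(label for _, label in top) / k if top else 0.0


def average_precision_at_k(monthly, k):
    """Mean of P@K over monthly prediction windows."""
    values = [precision_at_k(scored, k) for scored in monthly.values()]
    return sum(values) / len(values)


# Toy monthly windows with hypothetical scores and labels.
monthly = {
    "2021-10": [(0.9, 1), (0.8, 0), (0.3, 1)],
    "2021-11": [(0.7, 0), (0.6, 0), (0.2, 1)],
}
p_at_2_first_month = precision_at_k(monthly["2021-10"], 2)
ap_at_2 = average_precision_at_k(monthly, 2)
```

Averaging per month, rather than pooling all predictions, means that one unusually strong month cannot dominate the score, which is the temporal-stability property the metric is meant to capture.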
In this study, the test dataset was constructed using data collected after September 2021. To minimize the risk of data leakage from LLMs, we conducted all experiments using GPT-3.5, which was trained on data only up to September 1st, 2021. This ensures clear temporal isolation between the model’s training data and our test dataset, effectively avoiding potential leakage or contamination. To keep results consistent, we fixed the temperature parameter to 0 across all LLM calls. In the main experiment, the number of candidate investors k was set to 10. All reported results are averaged over five independent runs, each incorporating random sampling in the selection of candidate investors.
To evaluate the effectiveness of our proposed method, we carefully select six methods as our benchmarks.
Random assigns values to unlabeled data randomly, based on the success rate observed in historical data.
BERT Fusion (Maarouf, Feuerriegel, and Pröllochs 2025): This method combines structured startup variables with unstructured text, generates BERT embeddings, and uses them for classification tasks.
GNN-RAG (Mavromatis and Karypis 2024): Combines GNNs with RAG. The GNN generates node embeddings, while RAG retrieves shortest paths between similar nodes. These paths serve as contextual references for the LLM.
SSFF (Wang, Ihlamur, and Alican 2025): A multiagent framework that decomposes startup information into isolated dimensions for independent analysis. It integrates all insights into a single centralized decision-making agent, which is different from our multi-decision-agent interaction.
The main experimental results are presented in Table 1. Notably, in our model’s judgment mechanism, if any investor agent decides to invest, the final output is marked as True, a behavior aligned with real-world scenarios. The proposed model significantly outperforms the baseline methods across all evaluation metrics except recall, demonstrating its strong effectiveness in predicting the success of early-stage startups. In particular, SimVC-CAS achieves significant improvements in AP@K metrics, with AP@10 improving by 25.0%, AP@20 by 18.8%, and AP@30 by 13.1%. This demonstrates SimVC-CAS’s superior capability in ranking potentially successful startups. Notably, the AP@K gains increase as K decreases, indicating that the model has higher confidence in top-ranked results, a property particularly valuable in real-world decision-making. Further comparative analysis shows that graph-based models, including both traditional GNN approaches (e.g., SHGMNN and GST) and LLM RAG-enhanced variants like GNN-RAG, consistently underperform relative to SimVC-CAS. This underscores the importance of explicitly modeling the relationships among potential investors surrounding startups. Additionally, the notably weaker performance of the traditional multi-agent method SSFF further validates the advantage of incorporating multi-decision-maker interaction dynamics over single-decision mechanisms.
Furthermore, we find that while GNN methods (such as GST) achieve high recall (83.54%) through aggressive prediction, their precision drops significantly (21.75%). SimVC-CAS demonstrates overwhelming superiority in the most decision-critical AP@10 metric (37.52% vs 25.71%, a relative improvement of 46.1%), with precision and overall F1 scores also significantly outperforming GNN methods.
Impact of Key Components We also conducted ablation studies, as shown in Table 2, by removing key components of our model. Specifically, we ablated the Heterogeneous Investor Individual Modeling module and the Collective Interaction Modeling module, denoted as w/o roleplay and w/o interaction, respectively. Notably, the absence of the investor modeling module in w/o roleplay also eliminates any subsequent interaction modeling among investors. This variant essentially reflects the zero-shot capability of large models without task-specific adaptation. The substantial performance drop observed in both variants highlights the effectiveness and necessity of our architectural design.
Furthermore, we explored the impact of different interaction modeling strategies on model performance. The FullInteraction variant lets all agents interact with each other, whereas NetworkInteraction restricts interactions to investors directly connected within the co-investment network. Meanwhile, GAT Interaction replaces our VGAT-based interaction module with a standard GAT to capture interaction patterns. Interestingly, FullInteraction performs worse than w/o interaction (33.64 F1 vs 35.62 F1), suggesting that indiscriminate interactions among all decision-makers may introduce noise that hampers decision-making. On the other hand, NetworkInteraction outperforms both w/o interaction and FullInteraction (35.87 F1 vs 35.62 F1 vs 33.64 F1), validating the effectiveness of leveraging structured co-investment relationships for interaction modeling in VC contexts. Finally, although GAT Interaction shows slight improvement over NetworkInteraction, it still falls short of the full model. This suggests that while GNNs are suitable for modeling investor interactions, our VGAT-based approach more effectively captures the nuanced patterns within the co-investment network.
Impact of k To evaluate the impact of the number of candidate investors k on our method, we conducted experiments with k = 1, 10, 20, and 30. We also examined the performance differences across various interaction modes under different k values. As shown in Figure 3, which plots the F1 score of each interaction mode across the tested k values, all interaction modes except the traditional interaction demonstrate improved F1 as k increases. However, the rate of improvement diminishes significantly when k > 10. Based on these observations, we consider k = 10 to be a cost-effective choice: randomly selecting 10 investors around a startup provides a representative approximation of the broader investor group’s decision-making.
Given that real-world VC decisions involve substantial capital and complex evaluations, the computational overhead introduced by this model is minimal in comparison. Therefore, scaling up the number of candidate investors emerges as a practical and effective strategy for enhancing model performance.
Furthermore, it was found that when k=1, the F1 is lower than w/o roleplay (26.36 vs 30.03). At this point, the model does not involve multi-role interaction and is essentially a few-shot learning with biases. This indicates that relying solely on the experience of a single investor for few-shot learning will introduce significant decision biases, resulting in the model’s performance being weaker than the zero-shot capability of the LLM.
From the perspective of investor groups, this study proposes SimVC-CAS, a collective agent system designed to simulate the decision-making dynamics of venture capital. Unlike traditional approaches that overlook the investor collective effect, SimVC-CAS captures the dynamic interactions among investors during the investment decision-making process. By doing so, it offers a novel framework for understanding the VC decision-making mechanism. While our primary application domain is venture capital, the framework of SimVC-CAS can theoretically be extended to other fields involving complex group decision-making dynamics.
Although this study focuses on modeling collective decision-making, its methodology can be combined with the multi-agent analysis perspective presented in previous works (Yu et al. 2024;Zhang et al. 2024b;Wang, Ihlamur, and Alican 2025;Griffin et al. 2025). This integration would enable the construction of a comprehensive framework that bridges both the “multi-agent analysis perspective” and the “collective agent decision-making perspective”. Finally, the current treatment of potential investor group selection in our model is relatively simplified. To more accurately simulate investment behavior, future work will explore the prediction and modeling of potential investor groups.
Our training objective is to utilize supervised learning to enable the model to capture the interaction patterns among investors. The data records the historical investment information of each startup company. We use its future joint investment relationship with investors as a supervisory signal -this is a stricter indicator for measuring investor interaction. Although this standard is rather strict and may lose some potential interaction signals, the main experimental results in the text show that this method is effective.
To ensure no time overlap with the main evaluation task (based on companies that received their first round of financing from October 2021 to November 2022), we selected companies that received their first round of financing from October 2016 to September 2021 for training and evaluation. Each company used must have obtained its next round of financing within the following year, which provides the required data on future joint investment relationships. The dataset is divided as follows: Training set: companies that received their first round of financing from October 2016 to September 2019; Validation set: companies that received their first round of financing from October 2019 to March 2020; Test set: companies that received their first round of financing from April 2020 to September 2020.
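The temporal split can be expressed as a simple date lookup. The sketch below assumes that each month range is inclusive of its first and last calendar day, which the text does not state explicitly; `SPLITS` and `assign_split` are hypothetical names.

```python
from datetime import date

# Assumption: month ranges are inclusive of their first and last calendar day.
SPLITS = [
    ("train", date(2016, 10, 1), date(2019, 9, 30)),
    ("validation", date(2019, 10, 1), date(2020, 3, 31)),
    ("test", date(2020, 4, 1), date(2020, 9, 30)),
]


def assign_split(first_round):
    """Map a first-round financing date to its split, or None if outside all ranges."""
    for name, start, end in SPLITS:
        if start <= first_round <= end:
            return name
    return None


split_a = assign_split(date(2018, 5, 1))
split_b = assign_split(date(2019, 12, 15))
split_c = assign_split(date(2020, 9, 30))
split_d = assign_split(date(2021, 11, 1))  # inside the main evaluation period
```

Splitting by first-round date rather than at random keeps every validation and test company strictly later than the training companies, consistent with the leakage controls used in the main experiment.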
For each eligible startup, consistent with the main experiment, we randomly selected k = 10, 20, and 30 potential candidate investors to construct three independent joint investment relationship prediction networks. In each network, an edge is marked as a positive sample if the two investors it connects both invest in the company within the next year; otherwise, it is marked as a negative sample.
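The edge-labeling rule can be illustrated with a short sketch. How edges between candidate investors are defined is not specified in this passage, so the sketch assumes they come from a given list of historical co-investment pairs. The names `build_labeled_network`, `history_edges`, and `future_investors` are hypothetical.

```python
import random


def build_labeled_network(candidate_pool, history_edges, future_investors, k, seed=0):
    """Sample k candidate investors and label the edges among them.

    An edge is positive (1) only if BOTH endpoints invest in the company
    within the following year; otherwise it is negative (0).
    """
    rng = random.Random(seed)
    # Sort first so that sampling is reproducible for a given seed.
    nodes = rng.sample(sorted(candidate_pool), k)
    node_set = set(nodes)
    labeled = {}
    for u, v in history_edges:
        # Keep only edges whose two endpoints were both sampled.
        if u in node_set and v in node_set:
            edge = tuple(sorted((u, v)))
            labeled[edge] = int(u in future_investors and v in future_investors)
    return nodes, labeled


pool = {"a", "b", "c", "d"}
edges = [("a", "b"), ("b", "c"), ("c", "d")]
nodes, labeled = build_labeled_network(pool, edges, {"a", "b"}, k=4)
```

Calling the function once for each of k = 10, 20, and 30 would give the three independent networks per company.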
Because the candidate investors are randomly selected, some networks contain no positive samples, and in others the proportion of positive samples is very low. We therefore discard all networks in which the ratio of positive to negative samples is less than 0.05. After filtering, the dataset contains 1,992 networks in the training set, 247 networks in the validation set, and 222 networks in the test set. Among the retained networks, the ratio of positive to negative edge samples is 0.2062.
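A minimal sketch of this filtering step is given below, assuming each network is represented simply as a list of 0/1 edge labels. The function names and the handling of networks with no negative edges are assumptions.

```python
def pos_neg_ratio(labels):
    """Ratio of positive to negative edge labels in one network."""
    pos = sum(labels)
    neg = len(labels) - pos
    if neg == 0:
        # No negatives: treat the ratio as unbounded if any positive exists
        # (an assumption; the text does not discuss this case).
        return float("inf") if pos > 0 else 0.0
    return pos / neg


def filter_networks(networks, min_ratio=0.05):
    """Keep only networks whose positive-to-negative ratio reaches min_ratio."""
    return [labels for labels in networks if pos_neg_ratio(labels) >= min_ratio]


networks = [
    [0] * 10,        # no positives, ratio 0, dropped
    [1] + [0] * 30,  # ratio 1/30, below 0.05, dropped
    [1] + [0] * 10,  # ratio 0.1, kept
    [1, 1],          # no negatives, kept
]
kept = filter_networks(networks)
```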
To evaluate the model quantitatively, we assess its classification performance by computing the average precision (P), average recall (R), and average F1-score (F1) over the graphs within each test set.
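One plausible reading of these graph-averaged metrics is to compute P, R, and F1 separately on each graph's edges and then take the mean across graphs. The sketch below follows that reading. Scoring an undefined ratio as 0.0 is an assumption, and the helper names are hypothetical.

```python
def prf(y_true, y_pred):
    """Precision, recall and F1 for the binary edge labels of one graph."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Undefined ratios are scored as 0.0 (assumed convention).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def average_over_graphs(graphs):
    """Mean of per-graph (P, R, F1) over a list of (y_true, y_pred) pairs."""
    scores = [prf(y_true, y_pred) for y_true, y_pred in graphs]
    n = len(scores)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))


graphs = [
    ([1, 0, 1, 0], [1, 0, 0, 0]),  # P = 1.0, R = 0.5
    ([1, 1, 0], [1, 1, 1]),        # P = 2/3, R = 1.0
]
avg_p, avg_r, avg_f1 = average_over_graphs(graphs)
```

Averaging per graph gives every network equal weight regardless of its number of edges, which differs from pooling all edges into a single confusion matrix.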
To demonstrate the soundness of the VGAT design, we adopted a two-layer GAT model as the baseline for comparison. In this baseline: (1) virtual nodes are not introduced; (2) the embedding of the company information is concatenated directly to the feature vector of each investor node; (3) the features of an edge are formed by concatenating the features of its two endpoint investor nodes with the edge weight. All other structural designs and the loss function are consistent with those of VGAT.
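The baseline's input construction can be sketched with plain Python lists. This sketch covers only the two concatenation steps and omits the GAT layers, so the node features used for the edge are raw inputs here, whereas the baseline presumably concatenates node representations produced by the GAT layers. All function and variable names are hypothetical.

```python
def baseline_node_features(investor_feats, company_emb):
    """Append the company embedding to every investor's feature vector."""
    return {inv: list(feats) + list(company_emb)
            for inv, feats in investor_feats.items()}


def baseline_edge_feature(node_feats, u, v, weight):
    """Concatenate the two endpoint feature vectors with the edge weight."""
    return node_feats[u] + node_feats[v] + [float(weight)]


investor_feats = {"a": [1.0, 2.0], "b": [3.0, 4.0]}
company_emb = [9.0]
node_feats = baseline_node_features(investor_feats, company_emb)
edge_ab = baseline_edge_feature(node_feats, "a", "b", 0.5)
```

Because the company embedding is copied into every node, the edge vector contains it twice. Avoiding this duplication may be one motivation for VGAT's virtual node, although this passage does not state it.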
As shown in Table 3, VGAT demonstrates significant ad-
The team of First Automation includes Yuya Murakami, Masao Ito and Shun Sakurai. Although I didn’t work directly with Yuya or Shun, Masao Ito has a reliable track record in investing in various technology companies, including Aidemy. The presence of experienced investors can be seen as a positive signal, but the core team appears to lack substantial professional backgrounds in the fields of automation and hardware, which is concerning.
“First Automation” was established in 2020 and is in a relatively early stage in the automation industry. In the post-pandemic era, as enterprises pursue efficiency improvement, the market for automated services is growing. However, the company’s classification as “other hardware” may indicate that its market positioning is not clear enough, which could limit its growth potential.
When comparing “First Automation” with the companies I have invested in (such as Aidemy and Cover Corporation), I find that my successful investments are mainly concentrated in the fields of software and educational technology rather than hardware. For instance, Aidemy has achieved success in the field of educational software, which is quite different from the hardware automation services that “First Automation” focuses on. This difference in industry focus might limit the applicability of my previous investment insights to this startup.
Based on the limited experience in hardware investment, the core team’s insufficient expertise in automation and hardware, and the ambiguity of the market positioning of “First Automation”, I initially believe that the investment decision should be negative.
Final investment decision: False
Figure 7 presents the design of the prompts used in the model for initial decision-making and for interactive corrective decision-making.
This is your personal profile: {investor profile}. Now it’s {The time of the first round of financing for start-up}, and you will play the role of {investor’s name}, conducting investment analysis for a start-up company and deciding whether to invest in it within the next year.
Mandatory Process:
-- Phase 1: Declaration --
1. Self-introduction and profile-based capability assessment.
If lacking relevant experience: Explicitly state impact on decision validity.
-- Phase 2: Investment Analysis Framework --
Evaluate startup risks using: A. Team Validation. B. Company fundamental assessment. C. Industry Window Analysis. D. Benchmarking analysis (Comparison with your past investments).
• Constraints: Follow-on investment within 12 months.
Target startup’s portrait: {startup’s portrait}.
In addition, we will also provide you with your initial analysis and decision as well as those of other investors for your reference.
Your last analysis and decision: {investor’s initial analysis and decision}
Other investor’s profile, analysis and decision:
[{other investor’s profile, other investor’s analysis and decision}……]
For this evaluation, you must incorporate your past decisions and other investors’ choices and the level of other investors. If your current conclusion materially differs from historical patterns, explicitly explain newly identified risks or opportunities.
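To illustrate how such templates might be instantiated for the two decision rounds, the sketch below fills an abbreviated initial prompt and then appends an agent's own first-round output and its peers' outputs for the corrective round. The template strings are shortened stand-ins for the prompts above. The placeholder keys, the helper names, and the peer serialization format are assumptions and are not taken from the paper.

```python
import re

# Abbreviated stand-ins for the prompts above; keys are illustrative only.
INITIAL_TEMPLATE = (
    "This is your personal profile: {investor_profile}. "
    "Now it's {first_round_time}, and you will play the role of {investor_name}. "
    "Target startup's portrait: {startup_portrait}."
)
REVISION_SUFFIX = (
    " Your last analysis and decision: {own_initial}"
    " Other investors: {peers}"
)


def fill(template, values):
    """Substitute every {key} in one pass; raises KeyError if a key is missing."""
    return re.sub(r"\{(\w+)\}", lambda m: values[m.group(1)], template)


def format_peers(peers):
    """Serialize the other investors' profiles, analyses and decisions."""
    items = [f"{{{p['profile']}, {p['analysis']}, decision={p['decision']}}}"
             for p in peers]
    return "[" + ", ".join(items) + "]"


base = {
    "investor_profile": "software-focused angel",
    "first_round_time": "2021-10",
    "investor_name": "Investor A",
    "startup_portrait": "hardware automation startup",
}
initial_prompt = fill(INITIAL_TEMPLATE, base)

peers_text = format_peers([{"profile": "P1", "analysis": "A1", "decision": False}])
revision_prompt = fill(
    INITIAL_TEMPLATE + REVISION_SUFFIX,
    {**base, "own_initial": "Negative. Final decision: False.", "peers": peers_text},
)
```

The single-pass substitution means braces inside inserted values, such as those in the serialized peer list, are never treated as placeholders.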