Innopolis University
Recent publications
Mathematical and simulation models of microbial growth kinetics under given external conditions form the basis for diagnosing, monitoring, and forecasting the chemical substances that characterize the state of systems of various natures. This work presents an approach to modeling and numerically specifying the structural and dynamic characteristics of pattern formation during the surface cultivation of bacteria on nutrient media. The mathematical model is formalized as an initial-boundary value problem for a system of reaction-diffusion equations governing the concentrations of bacterial biomass and nutrient substrate. To reproduce naturalistic patterns, a stochastic procedure of evolutionary deformation of the bacterial population is introduced, with colonization potential and the formation of colonies with different incubation times. The algorithm for the numerical solution of the nonlinear differential problem is constructed using the Yanenko finite-difference splitting scheme, supplemented by an iterative procedure. The method for estimating the geometric characteristics of bacterial patterns is based on computing the fractal dimension of the boundaries of the cluster structure. The algorithms are implemented in the Matlab application package. Results of a series of computational experiments are presented, visualizing the spatio-temporal distributions of bacterial biomass and nutrient substrate as the control parameters of the hybrid model are varied. Regularities in the formation of branching-type bacterial colony patterns are identified as functions of the initial nutrient concentration, the diffusion parameter, and the stochastic growth parameter. It is established that increasing the initial nutrient concentration leads to a geometric phase transition: the structure transforms from a dendrite, through an irregular flower-like pattern, to a homogeneous cluster with a fairly smooth boundary.
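As an illustration of the fractal-dimension step mentioned in this abstract, the sketch below estimates the box-counting dimension of a binary colony mask. It is a minimal NumPy example (the paper's own implementation is in Matlab); the box sizes and the synthetic test pattern are assumptions made for this sketch.

```python
import numpy as np

def box_counting_dimension(mask, box_sizes=(2, 4, 8, 16, 32)):
    """Estimate the fractal (box-counting) dimension of a binary pattern.

    mask      : 2-D boolean array, True where biomass is present
    box_sizes : box edge lengths used for counting occupied boxes
    """
    counts = []
    for s in box_sizes:
        # Trim the mask so it tiles exactly into s x s boxes.
        h, w = (mask.shape[0] // s) * s, (mask.shape[1] // s) * s
        tiles = mask[:h, :w].reshape(h // s, s, w // s, s)
        # A box is "occupied" if it contains at least one True cell.
        counts.append(np.count_nonzero(tiles.any(axis=(1, 3))))
    # Slope of log(count) versus log(1/size) gives the dimension estimate.
    slope, _ = np.polyfit(np.log(1.0 / np.array(box_sizes)), np.log(counts), 1)
    return slope

# Toy check: a filled square should give a dimension close to 2.
demo = np.zeros((128, 128), dtype=bool)
demo[32:96, 32:96] = True
print(f"estimated dimension: {box_counting_dimension(demo):.2f}")
```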
The rapid development of machine learning and deep learning has introduced increasingly complex optimization challenges that must be addressed. Indeed, training modern, advanced models has become difficult without leveraging multiple computing nodes in a distributed environment. Distributed optimization is also fundamental to emerging fields such as federated learning. Specifically, there is a need to organize the training process so as to minimize the time lost due to communication. A widely used and extensively researched technique to mitigate the communication bottleneck involves performing local training before communication. This approach is the focus of our paper. Concurrently, adaptive methods that incorporate scaling, most notably Adam, have gained significant popularity in recent years. Therefore, this paper aims to merge the local training technique with the adaptive approach to develop efficient distributed learning methods. We consider the classical Local SGD method and enhance it with a scaling feature. A crucial aspect is that the scaling is described generically, allowing us to analyze various approaches, including Adam, RMSProp, and OASIS, in a unified manner. In addition to the theoretical analysis, we validate the performance of our methods in practice by training a neural network. Bibliography: 49 titles.
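To make the "Local SGD with generic scaling" idea concrete, here is a minimal NumPy sketch of the training loop: each worker performs several local preconditioned SGD steps, and parameters are averaged at communication rounds. The RMSProp-style preconditioner, step sizes, and toy quadratic objectives are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_workers, local_steps, rounds = 10, 4, 5, 20
lr, beta, eps = 0.1, 0.9, 1e-8

# Toy heterogeneous local objectives: f_i(x) = 0.5 * ||A_i x - b_i||^2
A = [rng.normal(size=(dim, dim)) for _ in range(n_workers)]
b = [rng.normal(size=dim) for _ in range(n_workers)]

def grad(i, x):
    return A[i].T @ (A[i] @ x - b[i])

x_global = np.zeros(dim)
v = [np.zeros(dim) for _ in range(n_workers)]   # per-worker second-moment estimates

for _ in range(rounds):
    local_models = []
    for i in range(n_workers):
        x = x_global.copy()
        for _ in range(local_steps):
            g = grad(i, x) + 0.01 * rng.normal(size=dim)   # stochastic gradient
            v[i] = beta * v[i] + (1 - beta) * g ** 2        # RMSProp-style scaling
            x -= lr * g / (np.sqrt(v[i]) + eps)             # preconditioned local step
        local_models.append(x)
    x_global = np.mean(local_models, axis=0)                # communication: averaging

print("final mean gradient norm:",
      np.mean([np.linalg.norm(grad(i, x_global)) for i in range(n_workers)]))
```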
Modern realities and trends in learning demand ever greater generalization ability from models, which leads to growth in both model size and training sample size. It is already difficult to solve such tasks on a single device. This is why distributed and federated learning approaches are becoming more popular every day. Distributed computing involves communication between devices, which requires solving two key problems: efficiency and privacy. One of the most well-known approaches to combat communication costs is to exploit the similarity of local data. Both Hessian similarity and homogeneous gradients have been studied in the literature, but separately. In this paper we combine both of these assumptions in analyzing a new method that incorporates the ideas of data similarity and client sampling. Moreover, to address privacy concerns, we apply the technique of additional noise and analyze its impact on the convergence of the proposed method. The theory is confirmed by training on real datasets. Bibliography: 45 titles.
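A minimal sketch of the client-sampling-plus-noise idea mentioned above: at each round a random subset of clients reports a gradient perturbed with Gaussian noise before aggregation. The sampling fraction, noise scale, and toy local objectives with similar optima are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n_clients, dim, rounds, sample_frac, sigma, lr = 20, 5, 50, 0.3, 0.05, 0.1

# Toy local objectives f_i(x) = 0.5 * ||x - c_i||^2 with similar optima (data similarity).
centers = rng.normal(scale=0.1, size=(n_clients, dim)) + 1.0

x = np.zeros(dim)
for _ in range(rounds):
    # Client sampling: only a random subset participates in this round.
    chosen = rng.choice(n_clients, size=max(1, int(sample_frac * n_clients)), replace=False)
    noisy_grads = []
    for i in chosen:
        g = x - centers[i]                      # exact local gradient
        g += sigma * rng.normal(size=dim)       # additional noise for privacy
        noisy_grads.append(g)
    x -= lr * np.mean(noisy_grads, axis=0)      # aggregate and take a step

print("distance to the average optimum:", np.linalg.norm(x - centers.mean(axis=0)))
```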
The general theory of greedy approximation with respect to arbitrary dictionaries is well developed in the case of real Banach spaces. Recently, some of the results proved for the Weak Chebyshev Greedy Algorithm (WCGA) in the case of real Banach spaces were extended to the case of complex Banach spaces. In this paper we extend some of the results known in the real case for greedy algorithms other than the WCGA to the case of complex Banach spaces. Bibliography: 25 titles.
Distributed optimization plays an important role in modern large-scale machine learning and data processing systems by optimizing the utilization of computational resources. One of the classical and popular approaches is Local Stochastic Gradient Descent (Local SGD), characterized by multiple local updates before averaging, which is particularly useful in distributed environments to reduce communication bottlenecks and improve scalability. A typical feature of this method is its dependence on the communication frequency. But in the case of a quadratic objective function with a homogeneous data distribution over all devices, the influence of the communication frequency vanishes. As a natural consequence, subsequent studies adopt the assumption of a Lipschitz Hessian, since it indicates that the optimized function is, to a certain extent, close to a quadratic one. However, in order to extend the completeness of Local SGD theory and unlock its potential, in this paper we abandon the Lipschitz Hessian assumption by introducing a new concept of approximate quadraticity. This assumption gives a new perspective on problems that have nearly quadratic properties. In addition, existing theoretical analyses of Local SGD often assume bounded variance. We, in turn, consider an unbounded noise condition, which allows us to broaden the class of problems under study. Bibliography: 36 titles.
Hyperspectral Imaging (HSI) has proven to be a powerful tool for capturing detailed spectral and spatial information across diverse applications. Despite the advancements in Deep Learning (DL) and Transformer architectures for HSI classification, challenges such as computational efficiency and the need for extensive labeled data persist. This paper introduces WaveMamba, a novel approach that integrates wavelet transformation with the spatial-spectral Mamba architecture to enhance HSI classification. WaveMamba captures both local texture patterns and global contextual relationships in an end-to-end trainable model. The wavelet-enhanced features are then processed through the state-space architecture to model spatial-spectral relationships and temporal dependencies. The experimental results indicate that WaveMamba surpasses existing models, achieving an accuracy improvement of 4.5% on the University of Houston dataset and a 2.0% increase on the Pavia University dataset. The source code will be made public at https://github.com/mahmad00.
With the exponential proliferation of digital documents, there is a pressing need for automated document summarization (ADS). Summarization is a compression technique that condenses a source document into concise sentences encapsulating its salient information. A primary challenge lies in crafting a dependable summary, which depends on both the extracted features and human-established parameters. This article introduces a methodology that integrates extractive and abstractive techniques to ensure high relevance between the input document and its summary. First, input sentences are transformed into representations using BERT and then arranged into a symmetric matrix based on their pairwise similarity. Semantically congruent sentences are extracted from this matrix to construct an extractive summary. The transformer model used for language generation incorporates an objective function that is highly symmetric and invariant under unitary transformations. This model refines the extracted informative sentences and generates an abstractive summary akin to manually crafted ones. Applying this hybrid summarization technique to the CNN/DailyMail and DUC2004 datasets, we evaluate its efficacy using ROUGE metrics. The results demonstrate the superiority of the proposed technique over conventional summarization methods.
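A minimal sketch of the extractive stage described above: sentence embeddings (assumed to come from BERT or any other encoder) are turned into a symmetric cosine-similarity matrix, and the most central sentences are selected. The centrality scoring and the number of selected sentences are illustrative assumptions.

```python
import numpy as np

def extractive_summary(sentences, embeddings, k=3):
    """Select the k sentences most similar to the rest of the document.

    sentences  : list of sentence strings
    embeddings : (n_sentences, d) array of sentence vectors (e.g. BERT outputs)
    """
    # Symmetric cosine-similarity matrix between all sentence pairs.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T

    # Score each sentence by its average similarity to the others (simple centrality).
    np.fill_diagonal(sim, 0.0)
    scores = sim.mean(axis=1)

    # Keep the top-k sentences in their original order.
    top = np.sort(np.argsort(scores)[::-1][:k])
    return [sentences[i] for i in top]

# Toy usage with random embeddings standing in for BERT representations.
rng = np.random.default_rng(0)
sents = ["Sentence one.", "Sentence two.", "Sentence three.", "Sentence four."]
print(extractive_summary(sents, rng.normal(size=(4, 8)), k=2))
```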
Over the past few decades, a variety of significant scientific breakthroughs have been achieved in the fields of brain encoding and decoding using functional magnetic resonance imaging (fMRI). Many studies have addressed the human brain's reaction to visual stimuli. However, the relationship between fMRI images and the video sequences viewed by humans remains complex and is often studied using large transformer models. In this paper, we investigate the correlation between videos presented to participants during an experiment and the resulting fMRI images. To achieve this, we propose a method for building a linear model that predicts changes in fMRI signals from the video sequence images. A linear model is constructed for each individual voxel of the fMRI image, under the assumption that the image sequence satisfies a Markov property. Through comprehensive qualitative experiments, we demonstrate the relationship between the two time series. We hope that our findings contribute to a deeper understanding of the human brain's reaction to external stimuli and provide a basis for future research in this area.
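A minimal sketch of the per-voxel linear modelling described above: under a first-order (Markov-style) assumption, each voxel's next value is regressed on compact features of the current video frame via least squares. The feature extraction, data shapes, and random stand-in data are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_features, n_voxels = 200, 16, 100

# Stand-ins for real data: per-frame video features and per-frame voxel responses.
video_feats = rng.normal(size=(n_frames, n_features))
fmri = rng.normal(size=(n_frames, n_voxels))

# Markov-style setup: predict the voxel signal at t+1 from the frame features at t.
X = np.hstack([video_feats[:-1], np.ones((n_frames - 1, 1))])   # add a bias column
Y = fmri[1:]

# One least-squares linear model per voxel, solved jointly as a matrix problem.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)    # shape: (n_features + 1, n_voxels)
pred = X @ W

# Per-voxel fit quality (correlation between prediction and signal).
corr = [np.corrcoef(pred[:, v], Y[:, v])[0, 1] for v in range(n_voxels)]
print("mean per-voxel correlation:", float(np.mean(corr)))
```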
In this work, we propose a new text complexity formula aimed at assessing the complexity of Russian school textbooks. We used the annotated Russian Academic Corpus containing over 5 million tokens as the training and validation data and employed machine learning methods in the study. The values of 4 parameters in each of the 154 texts used for the research were measured with tools from the spaCy library. Comparative analysis of the new and existing complexity formulas suggests that the differences between them are indicative and that the new formulas provide more accurate results. This research advances our understanding of the interdependency between frequency and text complexity and provides a framework for the effective use of lexical frequency patterns in discourse complexity studies. The findings can be used by textbook writers and test developers to select and modify texts for specific categories of readers. Other areas of application include website design, surveys, and semantic analysis of social networks.
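The abstract does not state the exact model form; as a hedged illustration, a linear complexity formula can be fit by ordinary least squares to four measured text parameters. The feature names, target scores, and data below are placeholders, not the corpus values or the paper's formula.

```python
import numpy as np

# Placeholder data: 154 texts x 4 measured parameters (e.g. sentence length, word
# length, lexical frequency, share of rare words) and synthetic complexity targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(154, 4))
y = X @ np.array([0.8, 0.5, -0.6, 0.3]) + 0.1 * rng.normal(size=154)

# Fit a linear readability-style formula: complexity = w0 + w1*p1 + ... + w4*p4.
A = np.hstack([np.ones((len(X), 1)), X])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
print("intercept and coefficients:", np.round(w, 3))

# Predicted complexity of a new text described by its 4 parameters.
new_text = np.array([1.0, 0.2, -0.5, 0.1])
print("predicted complexity:", float(w[0] + new_text @ w[1:]))
```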
Frameworks are among the most relevant tools for designing educational products today. Although various sources are devoted to defining the notion of a framework, there is no unified understanding of this concept. In the Russian educational field, frameworks are used as tools for solving various kinds of tasks. The working definition of a "methodological framework" adopted here is a ready-made methodological tool that allows highly specialized tasks in the design of educational products to be solved systematically. The paper describes the stages of designing a methodological framework, which include preparatory, basic, and final steps. The research objectives are: to formulate the concept of a methodological framework for use in the design of an educational product; to compare a methodological framework with a methodological recommendation; to describe the main elements of a methodological framework from the standpoint of a systematic approach; and to demonstrate the results and advantages of using a methodological framework. The article describes the experience of implementing a methodological framework within the "Creative Workshop Framlab", run as a Telegram group with more than 455 participants. The conclusions contribute to the practice of using methodological frameworks for designing educational products, as well as to the development of domestic pedagogical design.
Over the last 25 years, a considerable proliferation of software metrics and a plethora of tools to extract them have emerged. While this is an improvement over the earlier situation of limited data, it leads to a significant problem from both a theoretical and a practical standpoint. From a theoretical perspective, a large set of metrics is likely to result in collinearity, overfitting, and similar issues. From a practical perspective, such a set of metrics is difficult to manage, and companies, especially small ones, may feel overwhelmed and unable to select a viable subset of them. Still, it has not been fully understood so far which subset of metrics is viable for properly managing software projects and products. In this paper, we attempt to address this issue. We focus on programs written in Java and consider classes and methods. We use the Sammon error as a measure of the similarity of metrics. Utilizing both Particle Swarm Optimization and a Genetic Algorithm, we adapted a method for identifying a viable subset of such metrics that could solve the mentioned problem. Furthermore, we apply our approach to 800 projects from GitHub and validate the results on 200 further projects. With the proposed method we obtained optimal subsets of software engineering metrics. These subsets gave us low values of the Sammon error at more than 70% at class and method levels on a validation dataset.
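For reference, here is a minimal sketch of the Sammon error (stress) used above as a similarity measure: it compares pairwise distances between items computed from the full metric set with those computed from a candidate subset. The toy data and the chosen subset are assumptions; the paper's search over subsets uses Particle Swarm Optimization and a Genetic Algorithm, which are not reproduced here.

```python
import numpy as np
from scipy.spatial.distance import pdist

def sammon_error(full_metrics, subset_columns):
    """Sammon stress between distances in the full metric space and in a subset.

    full_metrics   : (n_items, n_metrics) array, one row per class or method
    subset_columns : indices of the candidate metric subset
    """
    d_full = pdist(full_metrics)                    # pairwise distances, all metrics
    d_sub = pdist(full_metrics[:, subset_columns])  # pairwise distances, subset only
    mask = d_full > 0                               # avoid division by zero
    return (np.sum((d_full[mask] - d_sub[mask]) ** 2 / d_full[mask])
            / np.sum(d_full[mask]))

# Toy usage: 50 "classes" described by 10 metrics; evaluate a 3-metric subset.
rng = np.random.default_rng(0)
metrics = rng.normal(size=(50, 10))
print("Sammon error of subset {0, 3, 7}:", sammon_error(metrics, [0, 3, 7]))
```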
In this paper, we study the standard formulation of an optimization problem when the computation of the gradient is not available. Such a problem can be classified as a "black-box" optimization problem, since the oracle returns only the value of the objective function at the requested point, possibly with some stochastic noise. Assuming convexity and higher-order smoothness of the objective function, this paper provides a zero-order accelerated stochastic gradient descent (ZO-AccSGD) method for solving this problem, which exploits the higher-order smoothness information via kernel approximation. As theoretical results, we show that the proposed ZO-AccSGD algorithm improves on the convergence results of state-of-the-art (SOTA) algorithms, namely the estimate of iteration complexity. In addition, our theoretical analysis provides an estimate of the maximum allowable noise level at which the desired accuracy can still be achieved. Validation of our theoretical results is demonstrated both on a model function and on functions of interest in the field of machine learning. We also provide a discussion in which we explain the results obtained and the superiority of the proposed algorithm over SOTA algorithms for solving the original problem.
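To illustrate the zero-order setting, the sketch below uses a standard two-point gradient estimator along a random direction; the paper's ZO-AccSGD additionally uses kernel weighting to exploit higher-order smoothness and an accelerated update, both of which are omitted here. The toy objective, noise level, and stepsize are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def zo_gradient(f, x, h=1e-3):
    """Two-point zero-order gradient estimate along a random unit direction."""
    e = rng.normal(size=x.shape)
    e /= np.linalg.norm(e)
    return x.size * (f(x + h * e) - f(x - h * e)) / (2 * h) * e

# Toy black-box objective (quadratic) observed with small additive noise.
def f(x):
    return 0.5 * np.sum((x - 1.0) ** 2) + 1e-4 * rng.normal()

x = np.zeros(5)
for _ in range(2000):
    x -= 0.05 * zo_gradient(f, x)      # plain zero-order SGD step
print("distance to the optimum:", np.linalg.norm(x - 1.0))
```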
Traffic congestion continues to pose a significant challenge in urban environments, necessitating innovative approaches to traffic management. This paper explores the application of Quantum Annealing (QA) for real-world traffic optimization, expanding on the pioneering work of Volkswagen and D-Wave. In 2017, a collaborative team demonstrated the potential of QA to optimize traffic flow by solving a complex Quadratic Unconstrained Binary Optimization (QUBO) problem involving 418 cars, which required 1,254 qubits. Later, this research culminated in a pilot project at the Web Summit conference in Lisbon, one of Europe’s largest technology events, showcasing quantum computing-based traffic optimization. Since the QPU alone could not directly handle the full problem size, the team employed a hybrid classical-quantum approach, leading to significant improvements in traffic distribution. This paper builds on that foundation by investigating potential speedups using a purely quantum approach, particularly by utilizing the QPU for smaller QUBO problems. The proposed method (MTF) enhances traffic management by decomposing the overall optimization problem into smaller, more manageable subproblems. This decomposition enables us to harness the advantages of the QPU while tackling more complex traffic scenarios that previous approaches struggled to manage. By breaking the problem into smaller parts, we mitigate the challenges associated with embedding large-scale problems into the QPU, which often presents computational difficulties. To evaluate our approach, we conducted experiments involving 100, 200, 300, 400, and 500 cars on a complex traffic map featuring multiple start and end points. We successfully embedded the problem into the D-Wave Advantage Quantum Processing Unit, utilizing the "Pegasus" topology, which resulted in a significant acceleration of the solution process. The experiment results show improved speed and effectiveness in real-world scenarios by leveraging the QPU for better traffic optimization.
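A minimal sketch of the kind of QUBO used in this line of work, at a high level: each car chooses one of a few candidate routes, a simplified congestion term penalizes chosen routes that share street segments, and a constraint term enforces exactly one route per car. The tiny instance and the brute-force solver below are illustrative assumptions; in the paper the (decomposed) subproblems are embedded on the D-Wave Advantage QPU instead.

```python
import itertools
import numpy as np

# Toy instance: 3 cars, 2 candidate routes each; routes are sets of street segments.
routes = {
    (0, 0): {"a", "b"}, (0, 1): {"c", "d"},
    (1, 0): {"a", "c"}, (1, 1): {"d", "e"},
    (2, 0): {"b", "e"}, (2, 1): {"c", "f"},
}
variables = list(routes)        # one binary variable per (car, route) pair
n = len(variables)
penalty = 10.0                  # weight enforcing "exactly one route per car"

Q = np.zeros((n, n))
for i, j in itertools.combinations(range(n), 2):
    (ci, _), (cj, _) = variables[i], variables[j]
    # Congestion cost: shared street segments between two chosen routes.
    Q[i, j] += len(routes[variables[i]] & routes[variables[j]])
    # Quadratic part of expanding penalty * (sum_r x_{car,r} - 1)^2.
    if ci == cj:
        Q[i, j] += 2 * penalty
for i in range(n):
    Q[i, i] -= penalty          # linear part of the same constraint expansion

def qubo_energy(x):
    x = np.asarray(x)
    return float(x @ Q @ x)

# Brute force over all assignments (fine at this toy size; the QPU handles larger ones).
best = min(itertools.product([0, 1], repeat=n), key=qubo_energy)
print({variables[i]: v for i, v in enumerate(best) if v})
```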
Disjoint sampling is critical for rigorous and unbiased evaluation of state-of-the-art (SOTA) models, e.g., Attention Graph and Vision Transformer. When the training, validation, and test sets overlap or share data, this introduces a bias that inflates performance metrics and prevents accurate assessment of a model's true ability to generalize to new examples. This paper presents an innovative disjoint sampling approach for training SOTA models for Hyperspectral Image Classification (HSIC). By separating training, validation, and test data without overlap, the proposed method facilitates a fairer evaluation of how well a model can classify pixels it was not exposed to during training or validation. Experiments demonstrate that the approach significantly improves a model's generalization compared to alternatives that include training and validation data in the test data (a trivial approach is to test the model on the entire Hyperspectral dataset to generate the ground truth maps; this produces higher accuracy but ultimately results in low generalization performance). Disjoint sampling eliminates data leakage between sets and provides reliable metrics for benchmarking progress in HSIC. Disjoint sampling is critical for advancing SOTA models and their real-world application to large-scale land mapping with Hyperspectral sensors. Overall, with the disjoint test set, the deep models achieve 96.36% accuracy on Indian Pines data, 99.73% on Pavia University data, 98.29% on University of Houston data, 99.43% on Botswana data, and 99.88% on Salinas data.
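A minimal sketch of per-class disjoint sampling for labeled hyperspectral ground-truth pixels: indices are shuffled once and split into non-overlapping train, validation, and test sets. The split ratios, the label convention (0 = unlabeled), and the toy map are assumptions.

```python
import numpy as np

def disjoint_split(ground_truth, train_frac=0.1, val_frac=0.1, seed=0):
    """Split labeled pixel indices into disjoint train/val/test sets, per class.

    ground_truth : 2-D integer array of class labels, 0 = unlabeled
    """
    rng = np.random.default_rng(seed)
    splits = {"train": [], "val": [], "test": []}
    for cls in np.unique(ground_truth):
        if cls == 0:                               # skip unlabeled pixels
            continue
        idx = np.argwhere(ground_truth == cls)     # (n, 2) pixel coordinates
        rng.shuffle(idx)
        n_tr = int(train_frac * len(idx))
        n_va = int(val_frac * len(idx))
        splits["train"].append(idx[:n_tr])
        splits["val"].append(idx[n_tr:n_tr + n_va])
        splits["test"].append(idx[n_tr + n_va:])   # remaining pixels, no overlap
    return {k: np.concatenate(v) for k, v in splits.items()}

# Toy ground-truth map with 3 classes plus unlabeled background.
gt = np.random.default_rng(1).integers(0, 4, size=(50, 50))
parts = disjoint_split(gt)
print({k: len(v) for k, v in parts.items()})
```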
Stochastic first-order methods are standard for training large-scale machine learning models. Random behavior may cause a particular run of an algorithm to result in a highly suboptimal objective value, whereas theoretical guarantees are usually proved for the expectation of the objective value. Thus, it is essential to theoretically guarantee that algorithms provide small objective residuals with high probability. Existing methods for non-smooth stochastic convex optimization have complexity bounds with the dependence on the confidence level that is either negative-power or logarithmic but under an additional assumption of sub-Gaussian (light-tailed) noise distribution that may not hold in practice. In our paper, we resolve this issue and derive the first high-probability convergence results with logarithmic dependence on the confidence level for non-smooth convex stochastic optimization problems with non-sub-Gaussian (heavy-tailed) noise. To derive our results, we propose novel stepsize rules for two stochastic methods with gradient clipping. Moreover, our analysis works for generalized smooth objectives with Hölder-continuous gradients, and for both methods, we provide an extension for strongly convex problems. Finally, our results imply that the first (accelerated) method we consider also has optimal iteration and oracle complexity in all the regimes, and the second one is optimal in the non-smooth setting.
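A minimal sketch of the gradient-clipping operator underlying the methods discussed above: the stochastic (sub)gradient is rescaled whenever its norm exceeds a clipping level, which tames heavy-tailed noise. The toy non-smooth problem, clipping level, and constant stepsize are assumptions; the paper's contribution is the specific stepsize rules and high-probability analysis, which are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def clip(g, lam):
    """Clipping operator: min(1, lam / ||g||) * g."""
    norm = np.linalg.norm(g)
    return g if norm <= lam else (lam / norm) * g

# Toy non-smooth convex objective f(x) = ||x - 1||_1 with heavy-tailed gradient noise.
def stochastic_subgrad(x):
    noise = rng.standard_t(df=2, size=x.shape)   # Student-t noise: infinite variance for df <= 2
    return np.sign(x - 1.0) + 0.5 * noise

x = np.zeros(10)
lam, lr = 2.0, 0.01
for _ in range(5000):
    x -= lr * clip(stochastic_subgrad(x), lam)   # clipped stochastic subgradient step
print("distance to the optimum:", np.linalg.norm(x - 1.0))
```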
One of the core problems in millimeter wave (mmWave) massive multiple-input-single-output (MISO) communication systems, which significantly affects the data rate, is the misalignment of the beam direction of the transmitter towards the receiver. In this paper, we investigate strategies that identify the best beam within a fixed duration of time. To this end, we develop an algorithm, named Unimodal Bandit for Best Beam (UB3), that exploits the unimodal structure of the mean received signal strength as a function of the available beams and identifies the best beam within a fixed time duration using pure exploration strategies. We derive an upper bound on the probability of misidentifying the best beam, and we prove that the upper bound is of the order $\mathcal{O}\left(\log_2 K \exp\{-\alpha n A\}\right)$, where $K$ is the number of beams, $A$ is a problem-dependent constant, and $\alpha n$ is the number of pilots used in the channel estimation phase. In contrast, when the unimodal structure is not exploited, the error probability is of order $\mathcal{O}\left(\log_2 K \exp\{-\alpha n A/(K\log K)\}\right)$. Thus, by exploiting the unimodal structure, we achieve a much better error probability, which depends only logarithmically on $K$. We demonstrate that UB3 outperforms the state-of-the-art algorithms through extensive simulations.
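As a toy illustration of how unimodality can be exploited in pure exploration, the sketch below runs a ternary-search-style elimination over the beam index range, discarding the side that cannot contain the peak. This is not the UB3 algorithm from the paper, only an illustration of the structural idea; the beam mean profile, noise level, and sample budgets are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 64
true_means = -((np.arange(K) - 40) / 10.0) ** 2   # unimodal mean RSS profile, peak at beam 40

def sample(beam, n):
    """Average of n noisy received-signal-strength measurements for a beam."""
    return true_means[beam] + rng.normal(scale=0.02, size=n).mean()

# Ternary-search-style elimination over the beam index range (exploits unimodality only).
lo, hi, budget_per_probe = 0, K - 1, 50
while hi - lo > 2:
    m1 = lo + (hi - lo) // 3
    m2 = hi - (hi - lo) // 3
    if sample(m1, budget_per_probe) < sample(m2, budget_per_probe):
        lo = m1 + 1          # the peak cannot lie to the left of m1 (up to noise)
    else:
        hi = m2 - 1
best = max(range(lo, hi + 1), key=lambda b: sample(b, budget_per_probe))
print("selected beam:", best, "| true best beam:", int(np.argmax(true_means)))
```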
994 members
Nikolay Shilov
  • Institute of Information Systems
Ilya Afanasyev
  • Faculty of Computer Science and Engineering
Manuel Mazzara
  • Department of Computer Science and Engineering
Information
Address
Kazan, Russia
Head of institution
Kirill Semenikhin