Source Coding - Science topic

In computer science and information theory, data compression, source coding or bit-rate reduction is the process of encoding information using fewer bits than the original representation would use.
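As a rough illustration of "fewer bits than the original representation", the sketch below builds a Huffman prefix code for a short string and compares the encoded length with a plain 8-bits-per-character encoding. Huffman coding is only one of many source-coding techniques and is chosen here purely as an example; the function names `huffman_code`, `encode`, and `decode` are hypothetical and do not come from any publication listed on this page.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(text)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        # Prefix the two lightest subtrees with 0 and 1 and merge them.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, code):
    """Concatenate the code word of every symbol."""
    return "".join(code[ch] for ch in text)


def decode(bits, code):
    """Invert a prefix code by reading bits until a code word matches."""
    inverse = {v: k for k, v in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return "".join(out)


if __name__ == "__main__":
    message = "abracadabra"
    code = huffman_code(message)
    bits = encode(message, code)
    print(len(bits), "bits encoded vs", 8 * len(message), "bits at 8 bits/char")
    assert decode(bits, code) == message
```

For the string `"abracadabra"`, the encoded message takes 23 bits against 88 bits at one byte per character. The comparison ignores the cost of storing the code table, which a real compressor would have to account for.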
Publications related to Source Coding (10,000)
Sorted by most recent
Conference Paper
Full-text available
Graph Anomaly Detection (GAD) is a technique used to identify abnormal nodes within graphs, finding applications in network security, fraud detection, social media spam detection, and various other domains. A common method for GAD is Graph Auto-Encoders (GAEs), which encode graph data into node representations and identify anomalies by assessing th...
Article
Full-text available
A new approach for developing implicit composite time integration schemes is established starting with rational approximations of the matrix exponential in the solution of the equations of motion. The rational approximations are designed to have the same effective stiffness matrix in all sub-steps. An efficient algorithm is devised so that the impl...
Article
Full-text available
In image processing, computer vision algorithms are applied to regions bounded by closed contours. These contours are often irregular, poorly defined, and contain holes or unavailable areas inside. A common problem in computational geometry includes finding the k-sided polygon (k-gon) of maximum area or maximum perimeter inscribed within a contour....
Article
Full-text available
Due to the heterogeneity of a large amount of real-world data, meta-paths are widely used in recommendation. Such recommendation methods can represent composite relationships between entities, but cannot explore reliable relations between nodes and influence among meta-paths. For solving this problem, a Community Aware Graph Embedding Learning meth...
Article
Full-text available
Background Variability in datasets is not only the product of biological processes: it is also the product of technical biases. ComBat and ComBat-Seq are among the most widely used tools for correcting those technical biases, called batch effects, in microarray and RNA-Seq expression data, respectively. Results In this technical note, we prese...
Article
Full-text available
Drug repositioning is a faster and more affordable solution than traditional drug discovery approaches. From this perspective, computational drug repositioning using knowledge graphs is a very promising direction. Knowledge graphs constructed from drug data and information can be used to generate hypotheses (molecule/drug - target links) through li...
Preprint
Full-text available
Ionospheric electrodynamics is a problem of mechanical stress balance mediated by electromagnetic forces. Joule heating (the total rate of frictional heating of thermospheric gases and ionospheric plasma) and ionospheric Hall and Pedersen conductances comprise three of the most basic descriptors of this problem. More than half a century after ident...
Article
Full-text available
In this work, we established, validated, and optimized a novel computational framework for tracing arbitrarily oriented actin filaments in cryo-electron tomography maps. Our approach was designed for highly complex intracellular architectures in which a long-range cytoskeleton network extends throughout the cell bodies and protrusions. The irregula...
Conference Paper
Full-text available
Code review is a common type of peer review in Computer Science (CS) education. It's a peer review process that involves CS students other than the original author examining source code and is widely acknowledged as an effective method for reducing software errors and enhancing the overall quality of software projects. While code review is an essen...
Preprint
Full-text available
To obtain more results in the field of theoretical models of multiagent systems, it is necessary to step beyond research done on paper and move on to computer simulation. Simulation opens the possibility of studying the behavior of the model in real time or verifying designed configurations. To develop an application of...
Article
Full-text available
The evolution of pesticide resistance is a widespread problem with potentially severe consequences for global food security. We introduce the resevol R package, which simulates individual-based models of pests with evolving genomes that produce complex, polygenic, and covarying traits affecting pest life history and pesticide resistance. Simulation...
Chapter
Full-text available
A powerful mechanism for detecting anomalies in a self-supervised manner was demonstrated by model training on normal data, which can then be used as a baseline for scoring anomalies. Recent studies on diffusion models (DMs) have shown superiority over generative adversarial networks (GANs) and have achieved better-quality sampling over variational...
Article
Full-text available
Current predictors of DNA-binding residues (DBRs) from protein sequences belong to two distinct groups, those trained on binding annotations extracted from structured protein-DNA complexes (structure-trained) vs. intrinsically disordered proteins (disorder-trained). We complete the first empirical analysis of predictive performance across the struc...
Preprint
Full-text available
Decentralized Finance (DeFi) uses blockchain technologies to transform traditional financial activities into decentralized platforms that run without intermediaries and centralized institutions. Smart contracts are programs that run on the blockchain, and by utilizing smart contracts, developers can more easily develop DeFi applications. Some key f...
Article
Full-text available
Background. Taxonomic identification through DNA barcodes gained considerable traction through the invention of next-generation sequencing and DNA metabarcoding. Metabarcoding allows for the simultaneous identification of thousands of organisms from bulk samples with high taxonomic resolution. However, reliable identifications can only be achieved...
Article
Full-text available
Artificial intelligence (AI) has been rapidly developing since the mid-20th century. Today, AI is deployed in many aspects of our daily lives and is increasingly ubiquitous. At the same time, concerns about AI, including discrimination, privacy and security, have prompted calls for greater regulation. To this end, regulators may seek access to AI'...
Preprint
Full-text available
Electrocardiographic (ECG) signals that monitor heart activity can help identify disease-related anomalies. Reliable automatic anomaly detection has been shown to be useful in supporting physicians in reading ECG signals. Decision support systems may be useful in such cases but their reliability can be guaranteed. Autoencoders (AEs) have been...
Preprint
Full-text available
Researchers often need to synthesize genes of interest in this era of synthetic biology. Gene synthesis by PCR assembly of multiple DNA fragments is a quick and economical method that is widely applied. Up to now, there have been a few software solutions for designing fragments in gene synthesis. However, some of these software solutions use progra...
Preprint
Full-text available
Gentoo Linux is a highly customizable and performance-oriented Linux distribution known for its unique features and flexibility. It is a source-based distribution, which means that software is compiled locally from source code rather than using pre-compiled binaries. This allows users to optimize their system and customize it to their specific need...
Article
Full-text available
Deep hashing networks have been successful in retrieving interesting images from massive remote sensing images. There is no doubt that security and reliability are critical in remote sensing image retrieval. Recent studies about natural image retrieval have shown the vulnerability of deep hashing networks to adversarial examples, but there do not e...
Article
Full-text available
Abstract The PHYAFB database is a valuable resource for studying the physiological demands of female amateur basketball players during high-stress official games. It contains heart rate data from ten players aged 18 to 26, collected during ten crucial relegation phase matches, with 348,232 HR samples in CSV and Excel formats for easy access and a...
Article
Full-text available
Tracing groundwater chemistry requires broad and continuous sampling campaigns to keep track of groundwater deterioration. The need is pressing in coastal aquifers due to high water demand to meet rapid population growth. This paper uses artificial intelligence (AI) to develop an algorithm capable of predicting seven major chemical ions in water (C...
Article
Full-text available
Scene understanding based on image segmentation is a crucial component of autonomous vehicles. Pixel-wise semantic segmentation of RGB images can be advanced by exploiting complementary features from the supplementary modality (X-modality). However, covering a wide variety of sensors with a modality-agnostic model remains an unresolved problem due...
Article
Full-text available
The structure of an RNA, and even more so the interactions it forms with other RNAs, provide valuable information about its function. Secondary structure-based tools for RNA-RNA interaction prediction provide a quick way to identify possible interaction targets and structures. However, these tools ignore the effect of steric hindrance on the tertia...
Preprint
Full-text available
In this paper, we introduce a framework of symmetry-preserving multimodal pretraining to learn a unified representation on proteins in an unsupervised manner that can take into account primary and tertiary structures. For each structure, we propose the corresponding pretraining method on sequence, graph and 3D point clouds based on large language m...
Article
Full-text available
Class imbalance (CI) is a well-known problem in data science. Nowadays, it is affecting the data modeling of many of the real-world processes that are being digitized. The manufacturing industry turns out to be highly affected by this problem, especially in fault inspection, prediction or monitoring processes, and in all those processes where the p...
Article
Full-text available
Background A Lactobacillus-dominated vaginal microbiome provides the first line of defense against adverse genital tract health outcomes. However, there is limited understanding of the mechanisms by which the vaginal microbiome modulates protection, as prior work mostly described its composition through morphologic assessment and marker gene sequen...
Article
Full-text available
Recently, deep convolutional neural networks (CNNs) have achieved significant advancements in single image demoiréing. However, most of the existing CNN-based demoiréing methods require excessive memory usage and computational cost, which considerably limits their application on mobile devices. Additionally, most CNN-based methods employ expensive approaches...
Article
Full-text available
One of the most time-consuming tasks for developers is the comprehension of new code bases. An effective approach to aid this process is to label source code files with meaningful annotations, which can help developers understand the content and functionality of a code base quicker. However, most existing solutions for code annotation focus on proj...
Article
Full-text available
RGB-D salient object detection (RGB-D SOD) has currently attracted much attention for its prospect of broad application. On the basis of the “encoder-decoder” paradigm of the fully convolutional network (FCN), many FCN-based strategies have emerged and achieved huge progress, but underestimated the potential of level-specific characteristics of mul...
Preprint
Full-text available
Use of fingerprints found at a crime scene is a common practice for identifying suspects in criminal investigations. Over the past two decades, attempts have been made to obtain additional information from fingerprints, beyond locating suspects as part of an investigation. This includes gender, age and nationality. Researchers demonstrated 75%-90%...
Article
Full-text available
Recent years have witnessed significant advancements in deep learning-based 3D object detection, leading to its widespread adoption in numerous applications. As 3D object detectors become increasingly crucial for security-critical tasks, it is imperative to understand their robustness against adversarial attacks. This paper presents the first compr...
Article
Full-text available
This article scrutinizes the escalating apprehensions surrounding algorithmic transparency, positing it as a pivotal facet for ethics and accountability in the development and deployment of artificial intelligence (AI) systems. By delving into legislative and regulatory initiatives across various jurisdictions, the article discerns how different co...
Preprint
Full-text available
Motivation The prediction of RNA structure canonical base pairs from a single sequence, especially pseudoknotted ones, remains challenging in thermodynamic models that approximate the energy of the local 3D motifs joining canonical stems. It has become more and more apparent in recent years that the structural motifs in the loops, composed of no...
Article
Full-text available
Per- and polyfluoroalkyl substances (PFAS) are a huge group of anthropogenic chemicals with unique properties that are used in countless products and applications. Due to the high stability of their C-F bonds, PFAS or their transformation products (TPs) are persistent in the environment, leading to ubiquitous detection in various samples worldwide....
Conference Paper
Full-text available
We consider the problem of program clone search, i.e. given a target program and a repository of known programs (all in executable format), the goal is to find the program in the repository most similar to the target program -- with potential applications in terms of reverse engineering, program clustering, malware lineage and software theft detect...
Article
Full-text available
Different abnormalities are commonly encountered in computer network systems. These types of abnormalities can lead to critical data losses or unauthorized access in the systems. Buffer overflow anomaly is a prominent issue among these abnormalities, posing a serious threat to network security. The primary objective of this study is to identify the...
Article
Full-text available
Spatial keyword query is a classical query processing mode for spatio-textual data, which aims to provide users the spatio-textual objects with the highest spatial proximity and textual similarity to the given query. However, the top-k result objects obtained by using the spatial keyword query mode are often similar to each other, while users hope...
Preprint
Full-text available
Effective identification of differentially expressed genes (DEGs) has been challenging for single-cell RNA sequencing (scRNA-seq) profiles. Many existing algorithms have high false positive rates (FPRs) and often fail to identify weak biological signals. Here, we present a novel method for identifying DEGs in scRNA-seq data called RankCompV3. It is...
Article
Full-text available
Context A reproducible build occurs if, given the same source code, build instructions, and build environment (i.e., installed build dependencies), compiling a software project repeatedly generates the same build artifacts. Reproducible builds are essential to identify tampering attempts responsible for supply chain attacks, with most of the resear...
Preprint
Full-text available
LaTeX is a free computer program for typesetting text, mathematical formulas, and graphic presentations. The TikZ package of LaTeX is the most complex and powerful tool for creating graphical elements. Atomic orbitals ψnlm (n = 1 − 4) of the electron in a hydrogen atom are presented by 3D surfaces and 2D color mapping contour plots by LaTeX. We pro...
Article
Full-text available
Even though human experience unfolds continuously in time, it is not strictly linear; instead, it entails cascading processes building hierarchical cognitive structures. For instance, during speech perception, humans transform a continuously varying acoustic signal into phonemes, words, and meaning, and these levels all have distinct but interdepen...
Article
Full-text available
Motivation: Identifying the functional sites of a protein, such as the binding sites of proteins, peptides, or other biological components, is crucial for understanding related biological processes and drug design. However, existing sequence-based methods have limited predictive accuracy, as they only consider sequence-adjacent contextual features...
Article
Full-text available
Urban mobility contributes significantly to greenhouse gas emissions and comes with negative social impacts for various groups, such as limited accessibility to opportunity or basic services. Transitions towards sustainable and people-centred urban mobility systems are paramount. Yet, this is accompanied by various challenges. Complex urban systems...
Article
Full-text available
Background Aptamers, which are biomaterials comprised of single-stranded DNA/RNA that form tertiary structures, have significant potential as next-generation materials, particularly for drug discovery. The systematic evolution of ligands by exponential enrichment (SELEX) method is a critical in vitro technique employed to identify aptamers that bin...
Article
Full-text available
Recent methods have been proposed to produce automatic rhyme annotators for large rhymed corpora. These methods, such as Baley (2022b), greatly reduce the cost of annotating rhymed material, allowing historical linguists to focus on the analysis of the rhyme patterns. However, evidence for the quality of those annotations has been anecdotal, consist...
Article
Full-text available
Background Standardized Antimicrobial Administration Ratios or SAARs are representations of antimicrobial use data and were first provided to hospitals voluntarily participating in the National Health Safety Network Antimicrobial Use (NHSN AU) option in 2015. Submission to the AU module will be mandatory per CMS effective 2024. AU module data is re...
Article
Full-text available
DNA N6-adenine methylation (N6-methyladenine, 6mA) plays a key regulatory role in cellular processes. Precisely recognizing 6mA sites is important for further exploring its biological functions. Although many computational methods for 6mA site prediction have been developed over the past decades, there is still considerable room for improvement. We prese...
Chapter
Full-text available
Spectrum analysis systems in online water quality testing are designed to detect types and concentrations of pollutants and enable regulatory agencies to respond promptly to pollution incidents. However, spectral data-based testing devices suffer from complex noise patterns when deployed in non-laboratory environments. To make the analysis model ap...
Article
Full-text available
Automation and human-robot collaboration are increasing in modern workplaces such as industrial manufacturing. Nowadays, humans rely heavily on advanced robotic devices to perform tasks quickly and accurately. Modern robots with computer vision and artificial intelligence are gaining attention and popularity rapidly. This paper demonstrates how a r...
Article
Full-text available
The purpose of imbalanced data classification is to solve the problem of unfair learning caused by the large difference in data distribution. Traditional classifiers are designed on the basis of balanced data, but their performance on imbalanced data declines sharply. Therefore, balancing the majority class and minority class samples before class...
Article
Full-text available
Abstract Meaning Representation (AMR) parsing aims to translate sentences to semantic AMR graphs and has recently been empowered by pre-trained Transformer models (e.g., BART). We argue that explicitly encoding syntactic knowledge is beneficial for AMR parsing, since the AMR graph of a sentence has similar substructures to those of its correspondin...
Conference Paper
Full-text available
Fantasy Sports has a current market size of $27B and is expected to grow more than $84B in less than a decade. The intent is to create virtual teams that somehow reflect what would happen if the constituent players actually played in a team. Using individual player and team statistics, models can be trained to predict an outcome. But fans are left...
Article
Full-text available
The approach described in this paper handles the parameters and characteristics (analog and discrete ones) of a Buck DC-DC converter (in its power and control parts) in a common manner. The usage of probably complicated differential equations for discrete dynamical systems is avoided by means of index matrix equations, which can be easily understoo...
Article
Full-text available
Single-cell RNA sequencing (scRNA-seq) technology studies transcriptome and cell-to-cell differences from higher single-cell resolution and different perspectives. Despite the advantage of high capture efficiency, downstream functional analysis of scRNA-seq data is made difficult by the excess of zero values (i.e., the dropout phenomenon). To effec...
Preprint
Full-text available
Numerical models are a powerful tool for investigating the dynamic processes in the interior of the Earth and other planets, but the reliability and predictive power of these discretized models depends on the numerical method, as well as an accurate representation of material properties in space and time. In the specific context of geodynamic model...
Article
Full-text available
Human leukocyte antigen (HLA) is closely involved in regulating the human immune system. Despite great advances in detecting classical HLA Class I binders, there are few methods or toolkits for recognizing non-classical HLA Class I binders. To fill this gap, we have developed a deep learning-based tool called DeepHLAPred. The DeepHLAPred used ele...
Article
Full-text available
Quantifying genetic clusters (=populations) from genotypic data is a fundamental, but non-trivial task for population geneticists that is compounded by: hierarchical population structure, diverse analytical methods, and complex software dependencies. AdmixPipe v3 ameliorates many of these issues in a single bioinformatic pipeline that facilitates a...
Preprint
Full-text available
Explicit feature-grid based NeRF models have shown promising results in terms of rendering quality and significant speed-up in training. However, these methods often require a significant amount of data to represent a single scene or object. In this work, we present a compression model that aims to minimize the entropy in the frequency domain in or...
Article
Full-text available
Aim Despite a vast number of articles on radiomics and machine learning in positron emission tomography (PET) imaging, clinical applicability remains limited, partly owing to poor methodological quality. We therefore systematically investigated the methodology described in publications on radiomics and machine learning for PET-based outcome predict...
Preprint
Full-text available
NanoLocz is an open-source computer program designed for high-throughput automatic processing and single-particle analysis of Atomic Force Microscopy (AFM) image data. High-Speed AFM and Localization AFM (LAFM) enable the study of single molecules with increasingly higher spatiotemporal resolution. However, efficient and rapid analysis of the image...
Article
Full-text available
An API is a contract between pieces of applications serving the main software once integrated into the source code of the main application. These pieces of applications communicate with each other in the language they both understand and over a network if needed (Jacobson, et al., 2012). It grants the functionality of one application to another whe...
Preprint
Full-text available
Can the smart radio environment paradigm measurably enhance the performance of contemporary urban macrocells? In this study, we explore the impact of reconfigurable intelligent surfaces (RISs) on a real-world sub-6 GHz MIMO channel. A rooftop-mounted macrocell antenna has been adapted to enable frequency domain channel measurements to be ascertai...
Technical Report
Full-text available
The ProjectGenius initiative embodies an innovative approach to project assistance management through the development of a comprehensive website. This platform aims to streamline the process of project discovery and facilitate efficient access to project resources by offering a centralized hub for users seeking diverse project ideas and correspondi...
Article
Full-text available
One of the significant features of software quality is software reliability. In the testing phase, faults are identified and corrected by integrating them into software development, thus obtaining better reliability. Here, by utilizing the Elliptical Distributions-centric Emperor Penguins Colony Algorithm (ED-EPCA)-based Test Case Prioritization (T...
Preprint
Full-text available
A new exploratory technique called biarchetype analysis is defined. We extend archetype analysis to find the archetypes of both observations and features simultaneously. The idea of this new unsupervised machine learning tool is to represent observations and features by instances of pure types (biarchetypes) that can be easily interpreted as they a...