
Ahmed Amer - Santa Clara University
About
Publications: 60
Reads: 7,723
Citations: 771
Current institution: Santa Clara University
Publications (60)
It is important to design digital infrastructure that can better accommodate multicultural and pluralistic views from its foundations. It is insufficient to look at only the responses and influences of culture on technology without considering how the technology can be adapted in anticipation of, and to support, pluralistic multicultural perspectiv...
Advancements in digital technology have eased the process of gathering, generating, and altering digital data at large scale. The sheer scale of the data necessitates the development and use of smaller secondary data, structured as ‘indices,’ which are typically used to locate desired subsets of the original data, thereby speeding up data referencin...
The increased ability of Artificial Intelligence (AI) technologies to generate and parse texts will inevitably lead to more proposals for AI’s use in the semantic sentiment analysis (SSA) of textual sources. We argue that instead of focusing solely on debating the merits of automated versus manual processing and analysis of texts, it is critical to...
This book constitutes the refereed proceedings of the 12th International Conference on Intelligent Technologies for Interactive Entertainment, INTETAIN 2020. Due to the COVID-19 pandemic, the conference was held virtually. The 19 full papers were selected from 49 submissions and present novel and innovative work in areas including art, science, desi...
One way to increase storage density is using a shingled magnetic recording (SMR) disk. We propose a novel use of SMR disks with RAID (redundant array of independent disks) arrays, specifically building upon, and comparing against, a basic RAID 4 arrangement. The proposed scheme (called RAID 4SMR) has the potential to improve the performance of a traditio...
Two-dimensional square RAID arrays organize their data disks in such a way that each of them belongs to exactly one row parity stripe and one column parity stripe. Even so, they remain vulnerable to the loss of any given data disk and the parity disks of its two stripes. We show how to eliminate all but one of these fatal triple failures by entangl...
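To make the row/column parity idea concrete, here is a minimal Python sketch (single-byte stand-ins for disks and plain XOR parity; the entanglement scheme itself is not reproduced here):

# A 2x2 grid of data "disks" (single bytes) with XOR row and column
# parities; any one lost data disk can be rebuilt from either stripe.
data = [[0x3A, 0x5C],
        [0x77, 0x1E]]
row_parity = [row[0] ^ row[1] for row in data]
col_parity = [data[0][c] ^ data[1][c] for c in range(2)]

# Rebuild data[1][0] from its row stripe...
assert row_parity[1] ^ data[1][1] == data[1][0]
# ...or, equivalently, from its column stripe.
assert col_parity[0] ^ data[0][0] == data[1][0]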
Stored data needs to be protected against device failure and irrecoverable sector read errors, yet doing so at exabyte scale can be challenging given the large number of failures that must be handled. We have developed RESAR (Robust, Efficient, Scalable, Autonomous, Reliable) storage, an approach to storage system redundancy that only uses XOR-base...
Shingled Magnetic Recording (SMR) is a means of increasing the density of hard drives that brings a new set of challenges. Due to the nature of SMR disks, updating in place is not an option. Holes left by invalidated data can only be filled if the entire band is reclaimed, and a poor band compaction algorithm could result in spending a lot of time...
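The truncated abstract does not show the algorithm itself, but a generic greedy baseline (pick the band with the fewest live blocks, since those are what must be rewritten) gives a sense of the problem; this sketch is illustrative, not the paper's proposal:

def pick_victim_band(live_blocks_per_band):
    """Greedy band compaction: reclaiming a band costs a rewrite of its
    live blocks, so choose the band with the fewest of them."""
    return min(range(len(live_blocks_per_band)),
               key=lambda b: live_blocks_per_band[b])

# Example: band 2 holds only 3 live blocks, so it is compacted first.
print(pick_victim_band([120, 87, 3, 54]))  # -> 2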
As the prices of magnetic storage continue to decrease, the cost of replacing failed disks becomes increasingly dominated by the cost of the service call itself. We propose to eliminate these calls by building disk arrays that contain enough spare disks to operate without any human intervention during their whole lifetime. To evaluate the feasibili...
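A back-of-the-envelope version of that feasibility question, under the simple (and here purely illustrative) assumption that disk failures arrive as a Poisson process:

from math import exp, factorial

def spares_needed(n_disks, afr, years, target=0.999):
    """Smallest spare count covering `target` of lifetimes, modeling
    failures as Poisson with mean n_disks * afr * years."""
    mean = n_disks * afr * years
    cum, k = 0.0, 0
    while cum < target:
        cum += exp(-mean) * mean**k / factorial(k)
        k += 1
    return k - 1

# Example: 100 disks, 2% annual failure rate, 10-year array lifetime:
# about 20 failures expected, so roughly 34 spares cover 99.9% of cases.
print(spares_needed(100, 0.02, 10))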
Disk failure rates vary so widely among different makes and models that designing storage solutions for the worst case scenario is a losing proposition. The approach we propose here is to design our storage solutions for the most probable case while incorporating in our design the option of adding extra redundancy when we find out that its disks ar...
As we move towards data centers at the exascale, the reliability challenges of such enormous storage systems are daunting. We demonstrate how such systems will suffer substantial annual data loss if only traditional reliability mechanisms are employed. We argue that the architecture for exascale storage systems should incorporate novel mechanisms a...
We present a two-dimensional RAID architecture that is specifically tailored to the needs of archival storage systems. Our proposal starts with a fairly conventional two-dimensional RAID architecture where each disk belongs to exactly one horizontal and one vertical RAID level 4 stripe. Once the array has been populated, we add a superparity device...
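One XOR identity worth noting in this setting (a general property of such layouts, not a claim about the paper's exact construction): the XOR of all the row parities equals the XOR of all the data blocks, which in turn equals the XOR of all the column parities, so a single extra "superparity" value can be computed from either set of parity devices alone.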
Shingled Magnetic Recording technology is expected to play a major role in the next generation of hard disk drives. But it introduces some unique challenges to system software researchers, and prototype hardware is not readily available for the broader research community. It is crucial to work on system software in parallel to hardware manufacturin...
Ultimately the performance and success of a shingled write disk (SWD) will be determined by more than the physical hardware realized, but will depend on the data layouts employed, the workloads experienced, and the architecture of the overall system, including the level of interfaces provided by the devices to higher levels of system software. Whil...
We present a more efficient chaining protocol for video-on-demand applications. Chaining protocols require each client to forward the video data it receives to the next client watching the same video. Unlike all extant chaining protocols, our protocol requires these clients to forward these data at a rate slightly higher than the video consumption...
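The effect of the faster forwarding rate is easy to quantify (with illustrative numbers, not figures from the paper): if video is consumed at rate b but forwarded at rate (1 + ε)b, a client that has been receiving for t seconds holds ε·t seconds of not-yet-consumed video. With ε = 0.1, ten minutes of playback leaves a downstream client a full minute ahead of its consumption point, slack that can cover the departure of its upstream peer.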
If the data density of magnetic disks is to continue its current 30–50% annual growth, new recording techniques are required. Among the actively considered options, shingled writing is currently the most attractive one because it is the easiest to implement at the device level. Shingled write recording trades the inconvenience of the inability to u...
Storage class memories (SCMs) constitute an emerging class of non-volatile storage devices that promise to be significantly faster and more reliable than magnetic disks. We propose to add one of these devices to each group of two or three RAID level 5 arrays and store on it additional parity data. We show that the new organization can tolerate all d...
Disk scrubbing periodically scans the contents of a disk array to detect the presence of irrecoverable read errors and reconstitute the contents of the lost blocks using the built-in redundancy of the disk array. We address the issue of scheduling scrubbing runs in disk arrays that can tolerate two disk failures without incurring a data loss, and p...
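The scheduling question matters because, with a scrub period of T, a latent irrecoverable error sits undetected for T/2 on average, and it is during that window that a coinciding disk failure becomes dangerous; for example (illustrative numbers), moving from monthly to weekly scrubs cuts the mean exposure window from roughly 15 days to about 3.5 days.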
We propose to increase the reliability of RAID level 5 arrays used for storing archival data. First, we identify groups of two or three identical RAID arrays. Second, we add to each group a shared parity disk containing the diagonal parities of their arrays. We show that the new organization can tolerate all double disk failures and between 75 and...
We investigate the impact of irrecoverable read errors, also known as bad blocks, on the MTTDL of mirrored disks, RAID level 5 arrays, and RAID level 6 arrays. Our study is based on the data collected by Bairavasundaram et al. from a population of 1.53 million disks over a period of 32 months. Our study indicates that irrecoverable read errors can red...
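A standard first-order derivation (notation ours, not taken from the truncated abstract) shows why such errors matter for mirrored disks: with disk failure rate λ = 1/MTTF and repair time MTTR, data loss requires a first failure (rate 2λ) followed either by a second failure during repair (probability ≈ λ·MTTR) or by an irrecoverable read error during the rebuild read (probability p), giving

MTTDL ≈ 1 / (2λ(λ·MTTR + p)) = MTTF² / (2(MTTR + p·MTTF)).

Because p grows with disk capacity, the p·MTTF term can easily dominate MTTR, making bad blocks, rather than double failures, the main driver of data loss.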
Two-dimensional RAID arrays maintain separate row and column parities for all their disks. Depending on their organization, they can tolerate between two and three concurrent disk failures without losing any data. We propose to enhance the robustness of these arrays by replacing a small fraction of these drives with storage class memory devices, an...
We evaluate the reliability of storage system schemes consisting of equal numbers of data disks and parity disks where each parity disk contains the exclusive or (XOR) of two or three of the data disks. These schemes are instances of Survivable Storage using Parity in Redundant Array Layouts (SSPiRAL). They have the same storage costs as mirrore...
We describe a novel method to study storage system predictability based on the visualization of conditional entropy. First-order conditional entropy can be used as a measure of predictability. It is superior to the more common measures such as independent likelihood of data access. For file access data, we developed a visualization tool that prod...
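A minimal Python rendering of that measure, assuming the trace is just a sequence of file identifiers (illustrative; not the visualization tool itself):

from collections import Counter, defaultdict
from math import log2

def successor_entropy(trace):
    """First-order conditional entropy H(next | current) of a trace;
    0 bits means every successor is fully predictable."""
    succ = defaultdict(Counter)
    for cur, nxt in zip(trace, trace[1:]):
        succ[cur][nxt] += 1
    total = len(trace) - 1
    h = 0.0
    for counts in succ.values():
        n = sum(counts.values())
        h_cond = -sum((c / n) * log2(c / n) for c in counts.values())
        h += (n / total) * h_cond          # weight by P(current)
    return h

print(successor_entropy(list("ababab")))    # -> 0.0, perfectly predictable
print(successor_entropy(list("aabbaabb")))  # -> ~0.96 bits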
We evaluate the reliability of storage system schemes consisting of n data disks and n parity disks where each parity disk contains the exclusive or (XOR) of two of the n data disks. These schemes are instances of the so-called SSPiRAL (survivable storage using parity in redundant array layouts). Even though they offer the simplicity of mirroring a...
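A brute-force check of such a scheme fits in a few lines of Python; the 4-data-disk layout below (parity p_i = d_i XOR d_(i+1)) is illustrative and not necessarily the exact array analyzed above:

from itertools import combinations

N = 4
# Each disk is the XOR of a set of data blocks, encoded as a bitmask:
# data disk d_i covers {i}; parity p_i covers {i, (i+1) mod N}.
disks = [1 << i for i in range(N)]
disks += [(1 << i) | (1 << ((i + 1) % N)) for i in range(N)]

def recoverable(failed):
    """Data survive iff the surviving disks' masks span GF(2)^N."""
    rows = [d for i, d in enumerate(disks) if i not in failed]
    r = 0
    for bit in range(N):
        piv = next((j for j in range(r, len(rows)) if rows[j] >> bit & 1), None)
        if piv is None:
            return False
        rows[r], rows[piv] = rows[piv], rows[r]
        for j in range(len(rows)):
            if j != r and rows[j] >> bit & 1:
                rows[j] ^= rows[r]
        r += 1
    return True

# Every single and double failure is survivable; the only fatal triples
# in this layout pair a data disk with both parities that cover it.
for k in (1, 2, 3):
    print(k, [f for f in combinations(range(2 * N), k) if not recoverable(f)])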
A cache replacement policy is normally suited to a particular class of applications, or limited to a set of fixed criteria for evaluating the cache-worthiness of an object. We present Universal Caching as a mechanism to capture the generality of the most adaptive algorithms, while depending on a very limited set of basic criteria for cache replacement...
Power conservation in systems is critical for mobile, sensor network, and other power-constrained environments. While disk spin-down policies can contribute greatly to reducing the power consumption of the storage subsystem, the reshaping of the access workload can actively increase such energy savings. Traditionally reshaping of the access workl...
The Secure and robust Critical Information-Technology Infrastructure (S-CITI for short) project aims at providing support to Emergency Managers (EMs) that are faced with management of resources and with decisions before, during, and after emergencies or disasters. Our approach consists of using new and existing sensors to gather data from the field...
Wireless sensor networks are expected to be an integral part of any pervasive computing environment. This implies an ever-increasing need for efficient energy and resource management of both the sensor nodes, as well as the overall sensor network, in order to meet the expected quality of data and service requirements. There have been numerous studi...
Interconnected computing nodes in pervasive systems demand efficient management to ensure longevity and effectiveness. This is particularly true when we consider wireless sensor networks, for which we propose a new scheme for adaptive route management. There have been numerous studies that have looked at the routing of data in sensor networks with...
Increasing efforts have been aimed towards the management of power as a critical system resource, and the disk can consume approximately a third of the power required for a typical laptop computer. Mechanisms to manage disk power have included spin-down policies and APIs to modify access workloads to be more power-friendly. In this work we present...
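For context, the classic break-even rule behind spin-down policies (illustrative numbers, not the paper's measurements): spinning down only pays off when the idle period outlasts the point where the energy saved equals the cost of the spin-down/spin-up cycle.

def break_even_seconds(p_idle, p_sleep, e_cycle):
    """Idle power and sleep power in watts, cycle energy in joules;
    returns the idle duration beyond which spinning down saves energy."""
    return e_cycle / (p_idle - p_sleep)

# Example: 0.9 W idle, 0.2 W asleep, 6 J per spin-down/up cycle.
print(break_even_seconds(0.9, 0.2, 6.0))  # -> ~8.6 seconds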
In the context of mobile data access, data caching is fundamental for both performance and functionality. For this reason there have been many studies into developing energy-efficient caching algorithms suitable for specific mobile environments. In this paper, we present a novel caching policy, Universal Mobile Caching (UMC), which is suitable for...
The use of access predictors to improve storage device performance has been investigated both as a way of improving access times and as a means of reducing the energy consumed by the disk. Such predictors also offer us an opportunity to demonstrate the benefits of an adaptive approach to handling unexpected workloads, whether they are the result of na...
Data access prediction has been proposed as a mechanism to overcome latency lag, and more recently as a means of conserving energy in mobile systems. We present a fully adaptive predictor that can optimize itself for any arbitrary workload, while simultaneously offering simple adjustment of goals between energy conservation and latency reduction....
Output of network file servers exhibits bursty traffic patterns, and this sometimes contends with control traffic. An example is contention between data traffic and write acknowledgments on a loaded server. In such a situation, we can improve the write performance by prioritizing write acknowledgments at the network interface of the server. To vali...
File prefetching based on previous file access patterns has been shown to be an effective means of reducing file system latency by implicitly loading caches with files that are likely to be needed in the near future. Mistaken prefetching requests can be very costly in terms of added performance overheads, including increased latency and bandwidth c...
The contribution of this paper is a novel approach to adaptivity that combines alternatives rather than selecting one among alternatives. Using only three homogeneous cache replacement algorithms, GD-GhOST was able to provide a cache replacement policy that requires no tuning or user intervention beyond the initial selection of the performance cr...
A popular solution to internet performance problems is the widespread caching of data. Many caching algorithms have been proposed in the literature, most attempting to optimize for one criterion or another, and recent efforts have explored the automation and self-tuning of caching algorithms in response to observed workloads. We extend these efforts...
Most existing studies of file access prediction are experimental in nature and rely on trace driven simulation to predict the performance of the schemes being investigated. We present a first order Markov analysis of file access prediction, discuss its limitations and show how it can be used to estimate the performance of file access predictors, su...
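The flavor of such an analysis is easy to reproduce: under a first-order Markov model, the best per-file strategy predicts each file's most frequent successor, and the expected hit rate falls out of the transition counts. A minimal sketch (ours, assuming a trace of file identifiers):

from collections import Counter, defaultdict

def markov_hit_rate(trace):
    """Expected accuracy of a 'most frequent successor' predictor
    under a first-order Markov model of the trace."""
    succ = defaultdict(Counter)
    for cur, nxt in zip(trace, trace[1:]):
        succ[cur][nxt] += 1
    total = len(trace) - 1
    # Sum over files of P(file observed) * P(its most common successor).
    return sum(max(c.values()) for c in succ.values()) / total

print(markov_hit_rate(list("abcabcabd")))  # -> 0.875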
Ahmed Amer, A. Luo, N. Der- [...], Alex Pang
We describe our experience graphically visualizing data access behavior, with a specific emphasis on visualizing the predictability of such accesses and the consistency of these observations at the block level. Such workloads are more frequently encountered after filtering through intervening cache levels and in this paper we demonstrate how such f...
Existing file access predictors keep track of previous file access patterns and rely on a single heuristic to predict which of the previous successors to the file being currently accessed is the most likely to be accessed next. We present here a novel composite predictor that applies multiple heuristics to this selection problem. As a result, it ca...
The gap between CPU speeds and the speed of the technologies providing the data is increasing. As a result, latency and bandwidth to needed data is limited by the performance of the storage devices and the networks that connect them to the CPU. Distributed caching techniques are often used to reduce the penalties associated with such caching; howev...
The trend in cache design research is towards finding the single optimum replacement policy that performs better than any other proposed policy by using all the useful criteria at once. However, due to the variety of workloads and system topologies it is daunting, if not impossible, to summarize all this information into one magical value using any...
We describe a novel on-line file access predictor, Recent Popularity, capable of rapid adaptation to workload changes while simultaneously predicting more events with greater accuracy than prior efforts. We distinguish the goal of predicting the most events accurately from the goal of offering the most accurate predictions (when declining to offer...
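As a hypothetical illustration of the idea (the published Recent Popularity algorithm may differ in its details), a predictor can track each file's last k successors, predict the most common of them, and decline to predict when the vote is weak:

from collections import Counter, deque

class RecentPopularityStyle:
    def __init__(self, k=4, min_votes=2):
        self.k, self.min_votes = k, min_votes
        self.history = {}          # file -> deque of its last k successors
        self.last = None

    def access(self, f):
        """Record an access, updating the previous file's successor list."""
        if self.last is not None:
            self.history.setdefault(self.last, deque(maxlen=self.k)).append(f)
        self.last = f

    def predict(self):
        """Predict the next access, or None when declining to offer one."""
        votes = Counter(self.history.get(self.last, ()))
        if not votes:
            return None
        best, n = votes.most_common(1)[0]
        return best if n >= self.min_votes else None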
We describe a way to manage distributed file system caches based upon groups of files that are accessed together. We use file access patterns to automatically construct dynamic groupings of files and then manage our cache by fetching groups, rather than single files. We present experimental results, based on trace-driven workloads, demonstrating th...
We propose a novel method to study storage system predictability based on the visualization of file successor entropy, a form of conditional entropy drawn from a file access trace. First-order conditional entropy can be used as a measure of predictability. It is superior to the more common measures such as independent likelihood of data access. For...
Prediction is a powerful tool for performance and usability. It can reduce access latency for I/O systems, and can improve usability for mobile computing systems by automating the file hoarding process. We present recent research that has resulted in a file successor predictor that matches the performance of state-of-the-art context-modeling predic...
The ability to automatically hoard data on a computer's local store would go a long way towards freeing the mobile user from dependence on the network and potentially unbounded latencies. An important step in developing a fully automated file hoarding algorithm is the ability to automatically identify strong relationships between files. We present...
We introduce the aggregating cache, and demonstrate how it can be used to reduce the number of file retrieval requests made by a caching client, improving storage system performance by reducing the impact of latency. The aggregating cache utilizes predetermined groupings of files to perform group retrievals. These groups are maintained by the serve...
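A minimal sketch of the group-fetch idea (with the hypothetical fetch_group callback standing in for the server protocol):

class AggregatingCache:
    """On a miss, fetch the whole predetermined group containing the
    file, so related files arrive in one round trip. Illustrative only."""
    def __init__(self, groups, fetch_group):
        # groups: {group_id: [file, ...]}; fetch_group: group_id -> {file: data}
        self.group_of = {f: g for g, files in groups.items() for f in files}
        self.fetch_group = fetch_group
        self.cache = {}

    def read(self, f):
        if f not in self.cache:
            self.cache.update(self.fetch_group(self.group_of[f]))
        return self.cache[f]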
While mean time to data loss (MTTDL) provides an easy way to estimate the reliability of redundant disk arrays, it fails to take into account the relatively short lifetime of these arrays. We analyzed five different disk array organizations and compared the reliability estimates obtained using their mean times to data loss with the more exa...
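The underlying point can be put in one formula: under the usual exponential assumption, the probability of losing data within a deployment of length t is 1 - e^(-t/MTTDL) ≈ t/MTTDL for t ≪ MTTDL; so (with illustrative numbers) an array boasting an MTTDL of one million hours still carries roughly a 4% chance of data loss over a five-year, roughly 43,800-hour, service life.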