Datasets: YashaP/ACMproject

Modalities: Text


train · 1.99k rows

input (string, lengths 331 to 3.18k)	output (sequence)
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Due to the success of deep learning to solving a variety of challenging machine learning tasks, there is a rising interest in understanding loss functions for training neural networks from a theoretical aspec...	[ "We provide necessary and sufficient analytical forms for the critical points of the square loss functions for various neural networks, and exploit the analytical forms to characterize the landscape properties for the loss functions of these neural networks." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['The backpropagation (BP) algorithm is often thought to be biologically implausible in the brain.', 'One of the main reasons is that BP requires symmetric weight matrices in the feedforward and feedback pathwa...	[ "Biologically plausible learning algorithms, particularly sign-symmetry, work well on ImageNet" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['We introduce the 2-simplicial Transformer, an extension of the Transformer which includes a form of higher-dimensional attention generalising the dot-product attention, and uses this attention to update entit...	[ "We introduce the 2-simplicial Transformer and show that this architecture is a useful inductive bias for logical reasoning in the context of deep reinforcement learning." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['We present Tensor-Train RNN (TT-RNN), a novel family of neural sequence architectures for multivariate forecasting in environments with nonlinear dynamics.', 'Long-term forecasting in such systems is highly c...	[ "Accurate forecasting over very long time horizons using tensor-train RNNs" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Recent efforts on combining deep models with probabilistic graphical models are promising in providing flexible models that are also easy to interpret.', 'We propose a variational message-passing algorithm fo...	[ "We propose a variational message-passing algorithm for models that contain both the deep model and probabilistic graphical model." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Modern deep neural networks have a large amount of weights, which make them difficult to deploy on computation constrained devices such as mobile phones.', 'One common approach to reduce the model size and co...	[ "A simple modification to low-rank factorization that improves performances (in both image and language tasks) while still being compact." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Deep learning training accesses vast amounts of data at high velocity, posing challenges for datasets retrieved over commodity networks and storage devices.', 'We introduce a way to dynamically reduce the ove...	[ "We propose a simple, general, and space-efficient data format to accelerate deep learning training by allowing sample fidelity to be dynamically selected at training time" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['It is fundamental and challenging to train robust and accurate Deep Neural Networks (DNNs) when semantically abnormal examples exist.', 'Although great progress has been made, there is still one crucial resea...	[ "ROBUST DISCRIMINATIVE REPRESENTATION LEARNING VIA GRADIENT RESCALING: AN EMPHASIS REGULARISATION PERSPECTIVE" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Generative Adversarial Networks (GANs) have achieved remarkable results in the task of generating realistic natural images.', 'In most applications, GAN models share two aspects in common.', 'On the one hand,...	[ "Are GANs successful because of adversarial training or the use of ConvNets? We show a ConvNet generator trained with a simple reconstruction loss and learnable noise vectors leads many of the desirable properties of a GAN." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['In this paper, we propose a novel kind of kernel, random forest kernel, to enhance the empirical performance of MMD GAN.', 'Different from common forests with deterministic routings, a probabilistic routing v...	[ "Equip MMD GANs with a new random-forest kernel." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Reinforcement learning in an actor-critic setting relies on accurate value estimates of the critic.', 'However, the combination of function approximation, temporal difference (TD) learning and off-policy trai...	[ "A method for more accurate critic estimates in reinforcement learning." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['We introduce a systematic framework for quantifying the robustness of classifiers to naturally occurring perturbations of images found in videos.', 'As part of this framework, we construct ImageNet-Vid-Robust...	[ "We introduce a systematic framework for quantifying the robustness of classifiers to naturally occurring perturbations of images found in videos." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Structured tabular data is the most commonly used form of data in industry according to a Kaggle ML and DS Survey.', 'Gradient Boosting Trees, Support Vector Machine, Random Forest, and Logistic Regression ar...	[ "Deep learning for structured tabular data machine learning using pre-trained CNN model from ImageNet." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Learning rich representations from predictive learning without labels has been a longstanding challenge in the field of machine learning.', 'Generative pre-training has so far not been as successful as contra...	[ "Decoding pixels can still work for representation learning on images" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Adaptive regularization methods pre-multiply a descent direction by a preconditioning matrix.', 'Due to the large number of parameters of machine learning problems, full-matrix preconditioning methods are pro...	[ "fast, truly scalable full-matrix AdaGrad/Adam, with theory for adaptive stochastic non-convex optimization" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Dialogue systems require a great deal of different but complementary expertise to assist, inform, and entertain humans.', 'For example, different domains (e.g., restaurant reservation, train ticket booking) o...	[ "In this paper, we propose to learn a dialogue system that independently parameterizes different dialogue skills, and learns to select and combine each of them through Attention over Parameters (AoP). " ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Model distillation aims to distill the knowledge of a complex model into a simpler one.', 'In this paper, we consider an alternative formulation called dataset distillation: we keep the model fixed and instea...	[ "We propose to distill a large dataset into a small set of synthetic data that can train networks close to original performance. " ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['We relate the minimax game of generative adversarial networks (GANs) to finding the saddle points of the Lagrangian function for a convex optimization problem, where the discriminator outputs and the distribu...	[ "We propose a primal-dual subgradient method for training GANs and this method effectively alleviates mode collapse." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Specifying reward functions is difficult, which motivates the area of reward inference: learning rewards from human behavior.', 'The starting assumption in the area is that human behavior is optimal given the...	[ "We find that irrationality from an expert demonstrator can help a learner infer their preferences. " ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Natural Language Processing models lack a unified approach to robustness testing.', 'In this paper we introduce WildNLP - a framework for testing model stability in a natural setting where text corruptions su...	[ "We compare robustness of models from 4 popular NLP tasks: Q&A, NLI, NER and Sentiment Analysis by testing their performance on perturbed inputs." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Training generative models like Generative Adversarial Network (GAN) is challenging for noisy data.', 'A novel curriculum learning algorithm pertaining to clustering is proposed to address this issue in this...	[ "A novel cluster-based algorithm of curriculum learning is proposed to solve the robust training of generative models." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Backdoor attacks aim to manipulate a subset of training data by injecting adversarial triggers such that machine learning models trained on the tampered dataset will make arbitrarily (targeted) incorrect pred...	[ "We proposed a novel distributed backdoor attack on federated learning and show that it is not only more effective compared with standard centralized attacks, but also harder to be defended by existing robust FL methods" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Graph networks have recently attracted considerable interest, and in particular in the context of semi-supervised learning.', 'These methods typically work by generating node representations that are propagat...	[ "Neural net for graph-based semi-supervised learning; revisits the classics and propagates labels rather than feature representations" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Neural architecture search (NAS) has made rapid progress incomputervision,wherebynewstate-of-the-artresultshave beenachievedinaseriesoftaskswithautomaticallysearched neural network (NN) architectures.', 'In c...	[ "Neural Architecture Search for a series of Natural Language Understanding tasks. Design the search space for NLU tasks. And Apply differentiable architecture search to discover new models" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Network embedding (NE) methods aim to learn low-dimensional representations of network nodes as vectors, typically in Euclidean space.', 'These representations are then used for a variety of downstream predic...	[ "In this paper we introduce EvalNE, a Python toolbox for automating the evaluation of network embedding methods on link prediction and ensuring the reproducibility of results." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Deep learning models can be efficiently optimized via stochastic gradient descent, but there is little theoretical evidence to support this.', 'A key question in optimization is to understand when the optimiz...	[ "Recovery guarantee of stochastic gradient descent with random initialization for learning a two-layer neural network with two hidden nodes, unit-norm weights, ReLU activation functions and Gaussian inputs." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Dropout is a simple yet effective technique to improve generalization performance and prevent overfitting in deep neural networks (DNNs).', 'In this paper, we discuss three novel observations about dropout to...	[ "Jumpout applies three simple yet effective modifications to dropout, based on novel understandings about the generalization performance of DNN with ReLU in local regions." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Concerns about interpretability, computational resources, and principled inductive priors have motivated efforts to engineer sparse neural models for NLP tasks.', 'If sparsity is important for NLP, might we...	[ "We study the natural emergence of sparsity in the activations and gradients for some layers of a dense LSTM language model, over the course of training." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['The integration of a Knowledge Base (KB) into a neural dialogue agent is one of the key challenges in Conversational AI.', 'Memory networks has proven to be effective to encode KB information into an externa...	[ "Conventional memory networks generate many redundant latent vectors resulting in overfitting and the need for larger memories. We introduce memory dropout as an automatic technique that encourages diversity in the latent space." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ["Su-Boyd-Candes (2014) made a connection between Nesterov's method and an ordinary differential equation (ODE). ", "We show if a Hessian damping term is added to the ODE from Su-Boyd-Candes (2014), then Neste...	[ "We derive Nesterov's method arises as a straightforward discretization of an ODE different from the one in Su-Boyd-Candes and prove acceleration the stochastic case" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['We propose learning to transfer learn (L2TL) to improve transfer learning on a target dataset by judicious extraction of information from a source dataset.', 'L2TL considers joint optimization of vastly-share...	[ "We propose learning to transfer learn (L2TL) to improve transfer learning on a target dataset by judicious extraction of information from a source dataset." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['In many partially observable scenarios, Reinforcement Learning (RL) agents must rely on long-term memory in order to learn an optimal policy.', 'We demonstrate that using techniques from NLP and supervised le...	[ "In Deep RL, order-invariant functions can be used in conjunction with standard memory modules to improve gradient decay and resilience to noise." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Optimization on manifold has been widely used in machine learning, to handle optimization problems with constraint.', 'Most previous works focus on the case with a single manifold.', 'However, in practice it ...	[ "This paper introduces an algorithm to handle optimization problem with multiple constraints under vision of manifold." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['It has long been assumed that high dimensional continuous control problems cannot be solved effectively by discretizing individual dimensions of the action space due to the exponentially large number of bins ...	[ "A method to do Q-learning on continuous action spaces by predicting a sequence of discretized 1-D actions." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Model-based reinforcement learning (MBRL) aims to learn a dynamic model to reduce the number of interactions with real-world environments.', 'However, due to estimation error, rollouts in the learned model, e...	[ "Our method incorporates WGAN to achieve occupancy measure matching for transition learning." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Batch Normalization (BN) and its variants have seen widespread adoption in the deep learning community because they improve the training of deep neural networks.', 'Discussions of why this normalization works...	[ "Gaussian normalization performs a least-squares fit during back-propagation, which zero-centers and decorrelates partial derivatives from normalized activations." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Batch Normalization (BN) has become a cornerstone of deep learning across diverse architectures, appearing to help optimization as well as generalization.', 'While the idea makes intuitive sense, theoretical ...	[ "We give a theoretical analysis of the ability of batch normalization to automatically tune learning rates, in the context of finding stationary points for a deep learning objective." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Generative models of natural images have progressed towards high fidelity samples by the strong leveraging of scale.', 'We attempt to carry this success to the field of video modeling by showing that large Ge...	[ "We propose DVD-GAN, a large video generative model that is state of the art on several tasks and produces highly complex videos when trained on large real world datasets." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Understanding procedural language requires anticipating the causal effects of actions, even when they are not explicitly stated.', 'In this work, we introduce Neural Process Networks to understand procedural ...	[ "We propose a new recurrent memory architecture that can track common sense state changes of entities by simulating the causal effects of actions." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['There has been a recent trend in training neural networks to replace data structures that have been crafted by hand, with an aim for faster execution, better accuracy, or greater compression. ', 'In this set...	[ "We investigate the space efficiency of memory-augmented neural nets when learning set membership." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['We leverage recent insights from second-order optimisation for neural networks to construct a Kronecker factored Laplace approximation to the posterior over the weights of a trained network.', 'Our approximat...	[ "We construct a Kronecker factored Laplace approximation for neural networks that leads to an efficient matrix normal distribution over the weights." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Spectral embedding is a popular technique for the representation of graph data.', 'Several regularization techniques have been proposed to improve the quality of the embedding with respect to downstream tasks...	[ "Graph regularization forces spectral embedding to focus on the largest clusters, making the representation less sensitive to noise. " ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['The exposure bias problem refers to the training-inference discrepancy caused by teacher forcing in maximum likelihood estimation (MLE) training for auto-regressive neural network language models (LM).', 'It ...	[ "We show that exposure bias could be much less serious than it is currently assumed to be for MLE LM training." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['The ability of algorithms to evolve or learn (compositional) communication protocols has traditionally been studied in the language evolution literature through the use of emergent communication tasks.', 'Her...	[ "A controlled study of the role of environments with respect to properties in emergent communication protocols." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['For understanding generic documents, information like font sizes, column layout, and generally the positioning of words may carry semantic information that is crucial for solving a downstream document intelli...	[ "Grid-based document representation with contextualized embedding vectors for documents with 2D layouts" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Deep reinforcement learning (RL) policies are known to be vulnerable to adversarial perturbations to their observations, similar to adversarial examples for classifiers.', "However, an attacker is not usually...	[ "Deep RL policies can be attacked by other agents taking actions so as to create natural observations that are adversarial." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['GloVe and Skip-gram word embedding methods learn word vectors by decomposing a denoised matrix of word co-occurrences into a product of low-rank matrices.', 'In this work, we propose an iterative algorithm fo...	[ "We present a novel iterative algorithm based on generalized low rank models for computing and interpreting word embedding models." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Deterministic models are approximations of reality that are often easier to build and interpret than stochastic alternatives. \n', 'Unfortunately, as nature is capricious, observational data can never be ful...	[ "We learn a conditional autoregressive flow to propose perturbations that don't induce simulator failure, improving inference performance." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Multi-hop question answering requires models to gather information from different parts of a text to answer a question.', 'Most current approaches learn to address this task in an end-to-end way with neural n...	[ "We improve answering of questions that require multi-hop reasoning extracting an intermediate chain of sentences." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Normalizing constant (also called partition function, Bayesian evidence, or marginal likelihood) is one of the central goals of Bayesian inference, yet most of the existing methods are both expensive and inac...	[ "We develop a new method for normalization constant (Bayesian evidence) estimation using Optimal Bridge Sampling and a novel Normalizing Flow, which is shown to outperform existing methods in terms of accuracy and computational time." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['We present a large-scale empirical study of catastrophic forgetting (CF) in modern Deep Neural Network (DNN) models that perform sequential (or: incremental) learning.\n', 'A new experimental protocol is prop...	[ "We check DNN models for catastrophic forgetting using a new evaluation scheme that reflects typical application conditions, with surprising results." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Federated Learning (FL) refers to learning a high quality global model based on decentralized data storage, without ever copying the raw data.', 'A natural scenario arises with data created on mobile phones b...	[ "Federated Averaging already is a Meta Learning algorithm, while datacenter-trained methods are significantly harder to personalize." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Memorization of data in deep neural networks has become a subject of significant research interest. \n', 'In this paper, we link memorization of images in deep convolutional autoencoders to downsampling thro...	[ "We identify downsampling as a mechansim for memorization in convolutional autoencoders." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Reinforcement learning provides a powerful and general framework for decision\n', 'making and control, but its application in practice is often hindered by the need\n', 'for extensive feature and reward engin...	[ "We propose an adversarial inverse reinforcement learning algorithm capable of learning reward functions which can transfer to new, unseen environments." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['We consider two questions at the heart of machine learning; how can we predict if a minimum will generalize to the test set, and why does stochastic gradient descent find minima that generalize well?', 'Our w...	[ "Generalization is strongly correlated with the Bayesian evidence, and gradient noise drives SGD towards minima whose evidence is large." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['In the industrial field, the positron annihilation is not affected by complex environment, and the gamma-ray photon penetration is strong, so the nondestructive detection of industrial parts can be realized.'...	[ "adversarial nets, attention mechanism, positron images, data scarcity" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['We revisit the Recurrent Attention Model (RAM, Mnih et al. (2014)), a recurrent neural network for visual attention, from an active information sampling perspective. \n\n', 'We borrow ideas from neuroscience ...	[ " Inspired by neuroscience research, solve three key weakness of the widely-cited recurrent attention model by simply adding two terms on the objective function." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Graph Neural Networks (GNNs) for prediction tasks like node classification or edge prediction have received increasing attention in recent machine learning from graphically structured data.', 'However, a larg...	[ "This paper introduces a clustering-based active learning algorithm on graphs." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Continuous Normalizing Flows (CNFs) have emerged as promising deep generative models for a wide range of tasks thanks to their invertibility and exact likelihood estimation.', 'However, conditioning CNFs on s...	[ "We propose the InfoCNF, an efficient conditional CNF that employs gating networks to learn the error tolerances of the ODE solvers " ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['A central goal of unsupervised learning is to acquire representations from unlabeled data or experience that can be used for more effective learning of downstream tasks from modest amounts of labeled data.', ...	[ "An unsupervised learning method that uses meta-learning to enable efficient learning of downstream image classification tasks, outperforming state-of-the-art methods." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Domain transfer is a exciting and challenging branch of machine learning because models must learn to smoothly transfer between domains, preserving local variations and capturing many aspects of variation wit...	[ "Conditional VAE on top of latent spaces of pre-trained generative models that enables transfer between drastically different domains while preserving locality and semantic alignment." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['We propose Adversarial Inductive Transfer Learning (AITL), a method for addressing discrepancies in input and output spaces between source and target domains.', 'AITL utilizes adversarial domain adaptation an...	[ "A novel method of inductive transfer learning that employs adversarial learning and multi-task learning to address the discrepancy in input and output space" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Named entity recognition (NER) and relation extraction (RE) are two important tasks in information extraction and retrieval (IE & IR).', 'Recent work has demonstrated that it is beneficial to learn these task...	[ "A novel, high-performing architecture for end-to-end named entity recognition and relation extraction that is fast to train." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['In this work we explore a straightforward variational Bayes scheme for Recurrent Neural Networks.\n', 'Firstly, we show that a simple adaptation of truncated backpropagation through time can yield good qualit...	[ " Variational Bayes scheme for Recurrent Neural Networks" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Over the passage of time Unmanned Autonomous Vehicles (UAVs), especially\n', 'Autonomous flying drones grabbed a lot of attention in Artificial Intelligence.\n', 'Since electronic technology is getting smalle...	[ "case study on optimal deep learning model for UAVs" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Music relies heavily on repetition to build structure and meaning. ', 'Self-reference occurs on multiple timescales, from motifs to phrases to reusing of entire sections of music, such as in pieces with ABA ...	[ "We show the first successful use of Transformer in generating music that exhibits long-term structure. " ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Sequential decision problems for real-world applications often need to be solved in real-time, requiring algorithms to perform well with a restricted computational budget.', 'Width-based lookaheads have shown...	[ "We propose a new Monte Carlo Tree Search / rollout algorithm that relies on width-based search to construct a lookahead." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Deep Neural Networks (DNNs) are known for excellent performance in supervised tasks such as classification.', 'Convolutional Neural Networks (CNNs), in particular, can learn effective features and build high-...	[ "We propose a novel deep neural network layer for normalising within-class covariance of an internal representation in a neural network that results in significantly improving the generalisation of the learned representations." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Generative models have proven to be an outstanding tool for representing high-dimensional probability distributions and generating realistic looking images.', 'A fundamental characteristic of generative model...	[ "The addition of a diversity criterion inspired from DPP in the GAN objective avoids mode collapse and leads to better generations. " ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Despite existing work on ensuring generalization of neural networks in terms of scale sensitive complexity measures, such as norms, margin and sharpness, these complexity measures do not offer an explanation ...	[ "We suggest a generalization bound that could partly explain the improvement in generalization with over-parametrization." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['We introduce three generic point cloud processing blocks that improve both accuracy and memory consumption of multiple state-of-the-art networks, thus allowing to design deeper and more accurate networks.\n\n...	[ "We introduce three generic point cloud processing blocks that improve both accuracy and memory consumption of multiple state-of-the-art networks, thus allowing to design deeper and more accurate networks." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['End-to-end acoustic-to-word speech recognition models have recently gained popularity because they are easy to train, scale well to large amounts of training data, and do not require a lexicon.', 'In addition...	[ "Methods to learn contextual acoustic word embeddings from an end-to-end speech recognition model that perform competitively with text-based word embeddings." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Unsupervised monocular depth estimation has made great progress after deep\n', 'learning is involved.', 'Training with binocular stereo images is considered as a\n', 'good option as the data can be easily obt...	[ "This paper propose a mask method which solves the previous blurred results of unsupervised monocular depth estimation caused by occlusion" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Graph classification is currently dominated by graph kernels, which, while powerful, suffer some significant limitations.', 'Convolutional Neural Networks (CNNs) offer a very appealing alternative.', 'However...	[ "We introduce a novel way to represent graphs as multi-channel image-like structures that allows them to be handled by vanilla 2D CNNs." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['The key attribute that drives the unprecedented success of modern Recurrent Neural Networks (RNNs) on learning tasks which involve sequential data, is their ever-improving ability to model intricate long-term...	[ "We propose a measure of long-term memory and prove that deep recurrent networks are much better fit to model long-term temporal dependencies than shallow ones." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Holistically exploring the perceptual and neural representations underlying animal communication has traditionally been very difficult because of the complexity of the underlying signal.', 'We present here a ...	[ "We compare perceptual, neural, and modeled representations of animal communication using machine learning, behavior, and physiology. " ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['The information bottleneck principle (Shwartz-Ziv & Tishby, 2017) suggests that SGD-based training of deep neural networks results in optimally compressed hidden layers, from an information theoretic perspect...	[ "The Information Bottleneck Principle applied to ResNets, using PixelCNN++ models to decode mutual information and conditionally generate images for information illustration" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['We study the problem of safe adaptation: given a model trained on a variety of past experiences for some task, can this model learn to perform that task in a new situation while avoiding catastrophic failure?...	[ "Adaptation of an RL agent in a target environment with unknown dynamics is fast and safe when we transfer prior experience in a variety of environments and then select risk-averse actions during adaptation." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['We propose the Neuro-Symbolic Concept Learner (NS-CL), a model that learns visual concepts, words, and semantic parsing of sentences without explicit supervision on any of them; instead, our model learns by s...	[ "We propose the Neuro-Symbolic Concept Learner (NS-CL), a model that learns visual concepts, words, and semantic parsing of sentences without explicit supervision on any of them." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Bayesian inference offers a theoretically grounded and general way to train neural networks and can potentially give calibrated uncertainty.', 'However, it is challenging to specify a meaningful and tractable...	[ "We introduce a Gaussian Process Prior over weights in a neural network and explore its ability to model input-dependent weights with benefits to various tasks, including uncertainty estimation and generalization in the low-sample setting." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['We perform an in-depth investigation of the suitability of self-attention models for character-level neural machine translation.', 'We test the standard transformer model, as well as a novel variant in which ...	[ "We perform an in-depth investigation of the suitability of self-attention models for character-level neural machine translation." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['The field of medical diagnostics contains a wealth of challenges which closely resemble classical machine learning problems; practical constraints, however, complicate the translation of these endpoints naive...	[ "we present the state-of-the-art results of using neural networks to diagnose chest x-rays" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Semmelhack et al. (2014) have achieved high classification accuracy in distinguishing swim bouts of zebrafish using a Support Vector Machine (SVM).', 'Convolutional Neural Networks (CNNs) have reached superio...	[ "We demonstrate the utility of a recent AI explainability technique by visualizing the learned features of a CNN trained on binary classification of zebrafish movements." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['When communicating, humans rely on internally-consistent language representations.', 'That is, as speakers, we expect listeners to behave the same way we do when we listen.', 'This work proposes several metho...	[ "Internal-consistency constraints improve agents ability to develop emergent protocols that generalize across communicative roles." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Neural networks (NNs) are able to perform tasks that rely on compositional structure even though they lack obvious mechanisms for representing this structure.', 'To analyze the internal representations that e...	[ "We introduce a new analysis technique that discovers interpretable compositional structure in notoriously hard-to-interpret recurrent neural networks." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['The vertebrate visual system is hierarchically organized to process visual information in successive stages.', 'Neural representations vary drastically across the first stages of visual processing: at the out...	[ "We reproduced neural representations found in biological visual systems by simulating their neural resource constraints in a deep convolutional model." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['While it has not yet been proven, empirical evidence suggests that model generalization is related to local properties of the optima which can be described via the Hessian.', 'We connect model generalization ...	[ "a theory connecting Hessian of the solution and the generalization power of the model" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Unsupervised learning is about capturing dependencies between variables and is driven by the contrast between the probable vs improbable configurations of these variables, often either via a generative model ...	[ "We introduced entropy maximization to GANs, leading to a reinterpretation of the critic as an energy function." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Neural Style Transfer has become a popular technique for\n', 'generating images of distinct artistic styles using convolutional neural networks.', 'This\n', 'recent success in image style transfer has raised ...	[ "We present a long time-scale musical audio style transfer algorithm which synthesizes audio in the time-domain, but uses Time-Frequency representations of audio." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['To communicate with new partners in new contexts, humans rapidly form new linguistic conventions.', 'Recent language models trained with deep neural networks are able to comprehend and produce the existing co...	[ "We propose a repeated reference benchmark task and a regularized continual learning approach for adaptive communication with humans in unfamiliar domains" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Traditional set prediction models can struggle with simple datasets due to an issue we call the responsibility problem.', 'We introduce a pooling method for sets of feature vectors based on sorting features a...	[ "Sort in encoder and undo sorting in decoder to avoid responsibility problem in set auto-encoders" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['We present a method for policy learning to navigate indoor environments.', 'We adopt a hierarchical policy approach, where two agents are trained to work in cohesion with one another to perform a complex navi...	[ "We present a hierarchical learning framework for navigation within an embodied learning setting" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Saliency methods aim to explain the predictions of deep neural networks.', 'These methods lack reliability when the explanation is sensitive to factors that do not contribute to the model prediction.', 'We us...	[ "Attribution can sometimes be misleading" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Large Transformer models routinely achieve state-of-the-art results on\n', 'a number of tasks but training these models can be prohibitively costly,\n', 'especially on long sequences.', 'We introduce two tech...	[ "Efficient Transformer with locality-sensitive hashing and reversible layers" ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Obtaining policies that can generalise to new environments in reinforcement learning is challenging.', 'In this work, we demonstrate that language understanding via a reading policy learner is a promising veh...	[ "We show language understanding via reading is promising way to learn policies that generalise to new environments." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['An open question in the Deep Learning community is why neural networks trained with Gradient Descent generalize well on real datasets even though they are capable of fitting random data.', 'We propose an appr...	[ "We propose a hypothesis for why gradient descent generalizes based on how per-example gradients interact with each other." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: [' Recent advances in deep learning have shown promising results in many low-level vision tasks.', 'However, solving the single-image-based view synthesis is still an open problem.', 'In particular, the generat...	[ "Novel architecture for stereoscopic view synthesis at arbitrary camera shifts utilizing adaptive t-shaped kernels with adaptive dilations." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Deep Neutral Networks(DNNs) require huge GPU memory when training on modern image/video databases.', 'Unfortunately, the GPU memory as a hardware resource is always finite, which limits the image resolution, ...	[ "This paper proposes fundamental theory and optimal algorithms for DNN training, which reduce up to 80% of training memory for popular DNNs." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Compression is a key step to deploy large neural networks on resource-constrained platforms.', 'As a popular compression technique, quantization constrains the number of distinct weight values and thus reduci...	[ "This paper proves the universal approximability of quantized ReLU neural networks and puts forward the complexity bound given arbitrary error." ]
### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: ['Reinforcement learning (RL) with value-based methods (e.g., Q-learning) has shown success in a variety of domains such as\n', 'games and recommender systems (RSs).', 'When the action space is finite, these al...	[ "A general framework of value-based reinforcement learning for continuous control" ]

README.md exists but content is empty.

Downloads last month: 3

Size of downloaded dataset files: 1.41 MB

Size of the auto-converted Parquet files: 1.41 MB

Number of rows: 1,992
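
Every `input` cell in the preview rows starts with the same instruction prefix, followed by what looks like a Python-style list of source sentences. The sketch below shows one way such a cell could be split back into its sentences. It assumes the untruncated cells hold a complete list literal after the prefix, which the truncated preview cannot confirm. The names `PREFIX` and `split_input` are hypothetical helpers, not part of the dataset.

```python
import ast

# Prompt prefix copied from the preview rows; assumed identical in every row.
PREFIX = (
    "### Instruction->PROVIDE ME WITH SUMMARY FOR THE GIVEN INPUT "
    "WHILE KEEPING THE MOST IMPORTANT DETAILS INTACT: "
)


def split_input(input_text):
    """Return the list of source sentences held in one `input` cell.

    Assumes the text after the prefix is a complete Python list literal.
    """
    if not input_text.startswith(PREFIX):
        raise ValueError("input does not start with the expected prefix")
    # literal_eval only parses literals, so it does not execute code.
    sentences = ast.literal_eval(input_text[len(PREFIX):])
    if not isinstance(sentences, list):
        raise ValueError("expected a list of sentences after the prefix")
    return sentences


# Hand-built cell using two sentences that are visible in the preview.
example_cell = PREFIX + (
    "['Saliency methods aim to explain the predictions of deep neural networks.', "
    "'These methods lack reliability when the explanation is sensitive to "
    "factors that do not contribute to the model prediction.']"
)
example_sentences = split_input(example_cell)
```

Cells that keep trailing `\n` characters inside sentences, as several preview rows do, would come back with those newlines intact, so a caller may want to apply `str.strip` to each sentence afterwards.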